Abstract. An asymptotic model for extreme behavior of certain Markov chains is the "tail chain".
Generally taking the form of a multiplicative random walk, it is useful in deriving extremal
characteristics such as point process limits. We place this model in a more general context, formulated
in terms of extreme value theory for transition kernels, and extend it by formalizing the
distinction between extreme and non-extreme states. We make the link between the update function and
transition kernel forms considered in previous work, and we show that the tail chain model leads to
a multivariate regular variation property of the finite-dimensional distributions under assumptions
on the marginal tails alone.



1. Introduction 

A method of approximating the extremal behavior of discrete-time Markov chains is to use
an asymptotic process called the tail chain under an asymptotic assumption on the transition
kernel of the chain. Loosely speaking, if the distribution of the next state converges under some
normalization as the current state becomes extreme, then the Markov chain behaves approximately
as a multiplicative random walk upon leaving a large initial state. This approach leads to intuitive
extremal models in such cases as autoregressive processes with random coefficients, which include a
class of ARCH models. The focus on Markov kernels was introduced by Smith [24]. Perfekt [18, 19]
extended the approach to higher dimensions, and Segers [23] rephrased the conditions in terms of
update functions.

Though not restrictive in practice, the previous approach tends to mask aspects of the processes' 
extremal behaviour. Markov chains which admit the tail chain approximation fall into one of two 
categories. Starting from an extreme state, the chain either remains extreme over any finite time 
horizon, or will drop to a "non-extreme" state of lower order after a finite amount of time. The 
latter case is problematic in that the tail chain model is not sensitive to possible subsequent jumps 
from a non-extreme state to an extreme one. Previous developments handle this by ruling out 
the class of processes exhibiting this behaviour via a technical condition, which we refer to as the 
regularity condition. Also, most previous work has assumed stationarity, since interest focused on 
computing the extremal index or deriving limits for the exceedance point processes, drawing on 
the theory established for stationary processes with mixing by Leadbetter et al. [17]. However,
stationarity is not fundamental in determining the extremal behaviour of the finite-dimensional 
distributions. 

We place the tail chain approximation in the context of an extreme value theory for Markovian 
transition kernels, which a priori does not necessitate any such restrictions on the class of processes 
to which it may be applied. In particular, we introduce the concept of boundary distribution, which 
controls tail chain transitions from non-extreme to extreme. Although distributional convergence 
results are more naturally phrased in terms of transition kernels, we treat the equivalent update
function forms as an integral component to interfacing with applications, and we phrase relevant 
assumptions in terms of both. While not making explicit a complete tail chain model for the class 
of chains excluded previously, we demonstrate the extent to which previous models may be viewed 
as a partial approximation within our framework. This is accomplished by formalizing the division 
between extreme and non-extreme states as a level we term the extremal boundary. We show 
that, in general, the tail chain approximates the extremal component, the portion of the original 
chain having yet to cross below this boundary. Phrased in these terms, the regularity condition 
requires that the distinction between the original chain and its extremal component disappears 
asymptotically. 

After introducing our extreme value theory for transition kernels, along with a representation in 
terms of update functions, we derive limits of finite-dimensional distributions conditional on the 
initial state, as it becomes extreme. We then examine the effect of the regularity condition on these 
results. Finally, adding the assumption of marginal regularly varying tails leads to convergence 
results for the unconditional distributions akin to regular variation. 

1.1. Notation and Conventions. We review notation and relevant concepts. If not explicitly
specified, assume that any space $S$ under discussion is a topological space paired with its Borel
$\sigma$-field $\mathcal{B}(S)$ (generated by the open sets) to form a measurable space. Denote by
$\mathcal{K}(S)$ the collection of its compact sets; by $C(S)$ the space of real-valued continuous,
bounded functions on $S$; and by $C_K^+(S)$ the space of non-negative continuous functions with
compact support. Weak convergence of probability measures is represented by $\Rightarrow$.

For a space $\mathbb{E}$ which is locally compact with countable base (for example, a subset of
$[-\infty,\infty]^d$), $M_+(\mathbb{E})$ is the space of non-negative Radon measures on
$\mathcal{B}(\mathbb{E})$; point measures consisting of single point masses at $x$ will be written as
$\epsilon_x(\cdot)$. A sequence of measures $\{\mu_n\} \subset M_+(\mathbb{E})$ converges vaguely to
$\mu \in M_+(\mathbb{E})$ (written $\mu_n \xrightarrow{v} \mu$) if
$\int_{\mathbb{E}} f\,d\mu_n \to \int_{\mathbb{E}} f\,d\mu$ as $n \to \infty$ for any
$f \in C_K^+(\mathbb{E})$. The shorthand $\mu(f) = \int f\,d\mu$ is handy. That the distribution of a
random vector $X$ is regularly varying on a cone $\mathbb{E} \subset [-\infty,\infty]^d \setminus \{0\}$
means that $tP[X/b(t) \in \cdot\,] \xrightarrow{v} \mu^*(\cdot)$ in $M_+(\mathbb{E})$ as $t \to \infty$
for some non-degenerate limit measure $\mu^* \in M_+(\mathbb{E})$ and scaling function
$b(t) \to \infty$. The limit $\mu^*$ is necessarily homogeneous in the sense that
$\mu^*(c\,\cdot) = c^{-\alpha}\mu^*(\cdot)$ for some $\alpha > 0$. The regular variation
is standard if $b(t) = t$.
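As a small worked illustration of this definition (an example added here, not taken from the source), suppose $X$ has a Pareto tail $P[X > x] = x^{-\alpha}$ for $x \ge 1$, and take the assumed scaling $b(t) = t^{1/\alpha}$ on the cone $(0,\infty]$. The limit measure and its homogeneity can then be read off directly:

```latex
% Assumed example: P[X > x] = x^{-\alpha} for x \ge 1, scaling b(t) = t^{1/\alpha}.
\[
  t\,P\bigl[X/b(t) > x\bigr]
  = t\,\bigl(t^{1/\alpha}x\bigr)^{-\alpha}
  = x^{-\alpha}
  \qquad (x > 0,\ t \ge x^{-\alpha}),
\]
\[
  \mu^*\bigl((x,\infty]\bigr) = x^{-\alpha},
  \qquad
  \mu^*\bigl(c\,(x,\infty]\bigr) = (cx)^{-\alpha} = c^{-\alpha}\,\mu^*\bigl((x,\infty]\bigr).
\]
```

With $\alpha = 1$ the scaling is $b(t) = t$, which is the standard case in the sense above.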

If $X = (X_0, X_1, X_2, \dots)$ is a (homogeneous) Markov chain and $K$ is a Markov transition kernel,
we write $X \sim K$ to mean that the dependence structure of $X$ is specified by $K$, i.e.

$P[X_{n+1} \in \cdot \mid X_n = x] = K(x, \cdot), \qquad n = 0, 1, \dots$

We adopt the standard shorthand
$P_x[(X_1, \dots, X_m) \in \cdot\,] = P[(X_1, \dots, X_m) \in \cdot \mid X_0 = x]$. Some
useful technical results are assembled in Section 8 (p. 21).



2. Extremal Theory for Markov Kernels 



We begin by focusing on the Markov transition kernels rather than the stochastic processes they 
determine, and introduce a class of kernels we term "tail kernels," which we will view as scaling 
limits of certain kernels. Antecedents include Segers' [23] definition of "back-and-forth tail chains"
that approximate certain Markov chains started from an extreme value.

For a Markov chain $X \sim K$ on $[0,\infty)$, it is reasonable to expect that extremal behaviour of $X$
is determined by pairs $(X_n, X_{n+1})$, and one way to control such pairs is to assume that
$(X_n, X_{n+1})$ belongs to a bivariate domain of attraction (cf. [3, 23]). In the context of regular
variation, writing

(2.1)  $tP\left[\left(\dfrac{X_n}{t}, \dfrac{X_{n+1}}{t}\right) \in A_0 \times A_1\right] = \displaystyle\int_{A_0} K(tu, tA_1)\; tP\left[\dfrac{X_n}{t} \in du\right]$

suggests combining marginal regular variation of $X_n$ with a scaling kernel limit to derive extremal
properties of the finite-dimensional distributions (fdds) [18, 19, 23], and this is the direction we
take. We first discuss the kernel scaling operation.

For simplicity, we assume the state space of the Markov chain is $[0,\infty)$, although with suitable
modifications, it is relatively straightforward to extend the results to $\mathbb{R}^d$. Henceforth $G$
and $H$ will denote probability distributions on $[0,\infty)$.

2.1. Tail Kernels. The tail kernel associated with $G$, with boundary distribution $H$, is

(2.2)  $K^*(y, A) = \begin{cases} G(y^{-1}A) & y > 0 \\ H(A) & y = 0 \end{cases}$

for any measurable set $A$. Thus, the class of tail kernels on $[0,\infty)$ is parameterized by the pair of
probability distributions $(G, H)$. Such kernels are characterized by a scaling property:

Proposition 2.1. A Markov transition kernel $K$ is a tail kernel associated with some $(G, H)$ if
and only if it satisfies the relation

(2.3)  $K(uy, A) = K(y, u^{-1}A)$

when $y > 0$ for any $u > 0$, in which case $G(\cdot) = K(1, \cdot)$. The property (2.3) extends to
$y = 0$ iff $H = \epsilon_0$.

Proof. If $K$ is a tail kernel, (2.3) follows directly from the definition. Conversely, assuming (2.3),
for $y > 0$ we can write

$K(y, A) = K(1, y^{-1}A),$

demonstrating that $K$ is a tail kernel associated with $K(1, \cdot)$ (with boundary distribution
$H = K(0, \cdot)$). To verify the second assertion, fixing $u > 0$, we must show that
$H(u^{-1}\,\cdot) = H(\cdot)$ iff $H = \epsilon_0$. On the one hand, we have
$\epsilon_0(u^{-1}A) = \epsilon_0(A)$. On the other, $H(0,\infty) = \lim_{n\to\infty} H(n^{-1}, \infty) =
H(1,\infty)$, so $H(0, 1] = 0$. A similar argument shows that $H(1,\infty) = 0$ as well. □
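The scaling relation (2.3) can be checked numerically for a concrete tail kernel. The sketch below is only an illustration: the choice of $G$ as a unit-rate exponential distribution and the helper names `g_cdf` and `tail_kernel` are assumptions made for the example, not part of the source.

```python
import math


def g_cdf(x):
    """Hypothetical step distribution G: unit-rate exponential on [0, inf)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def tail_kernel(y, a, b):
    """K*(y, (a, b]) = G((a/y, b/y]) for y > 0, following definition (2.2)."""
    return g_cdf(b / y) - g_cdf(a / y)


# Check K*(u*y, A) = K*(y, A/u) for A = (a, b].
y, u, a, b = 2.0, 5.0, 1.0, 4.0
lhs = tail_kernel(u * y, a, b)
rhs = tail_kernel(y, a / u, b / u)
print(lhs, rhs)
```

Both sides reduce to $G((a/(uy), b/(uy)])$, which is why they agree for any $y, u > 0$.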

We call the Markov chain $T \sim K^*$ a tail chain associated with $(G, H)$. Such a chain can be
represented as

(2.4)  $T_n = \xi_n T_{n-1} + \xi'_n \mathbf{1}_{\{T_{n-1} = 0\}}$ for $n = 1, 2, \dots,$

where $\xi_n \sim G$ and $\xi'_n \sim H$ are independent of each other and of $T_0$. If $H = \epsilon_0$,
then $T$ becomes a multiplicative random walk with step distribution $G$ and absorbing barrier at
$\{0\}$: $T_n = T_0\,\xi_1 \cdots \xi_n$.
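The recursion (2.4) translates directly into a simulation. The following is a minimal sketch under assumed inputs: the function `simulate_tail_chain` and the particular samplers for $G$ (mass 0.3 at zero, otherwise uniform on [0.5, 1.5]) and $H = \epsilon_0$ are hypothetical choices made only to illustrate the absorbing-barrier case.

```python
import random


def simulate_tail_chain(t0, sample_g, sample_h, n_steps, rng):
    """Simulate T_0, ..., T_n via T_n = xi_n * T_{n-1} + xi'_n * 1{T_{n-1} = 0}."""
    path = [t0]
    for _ in range(n_steps):
        prev = path[-1]
        xi = sample_g(rng)        # multiplicative step, distributed as G
        xi_prime = sample_h(rng)  # boundary draw, distributed as H
        path.append(xi * prev + (xi_prime if prev == 0 else 0.0))
    return path


def sample_g(rng):
    """Hypothetical G: atom of mass 0.3 at zero, otherwise Uniform(0.5, 1.5)."""
    return 0.0 if rng.random() < 0.3 else rng.uniform(0.5, 1.5)


def sample_h(rng):
    """Hypothetical H = epsilon_0, so that {0} is absorbing."""
    return 0.0


rng = random.Random(0)
path = simulate_tail_chain(1.0, sample_g, sample_h, 20, rng)
print(path)
```

Replacing `sample_h` by a sampler with positive values would let the simulated chain jump away from zero, which is the role the text assigns to the boundary distribution.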

2.2. Convergence to Tail Kernels. The tail chain approximates the behaviour of a Markov
chain $X \sim K$ in extreme states. Asymptotic results require that the normalized distribution of $X_1$
be well-approximated by some distribution $G$ when $X_0$ is large, and we interpret this requirement
as a domain of attraction condition for kernels.

Definition. A Markov transition kernel $K : [0,\infty) \times \mathcal{B}[0,\infty) \to [0,1]$ is in the
domain of attraction of $G$, written $K \in D(G)$, if as $t \to \infty$,

(2.5)  $K(t, t\,\cdot) \Rightarrow G(\cdot)$ on $[0,\infty]$.

Note that $D(G)$ contains at least the class of tail kernels associated with $G$ (i.e. with any boundary
distribution $H$). A simple scaling argument extends (2.5) to

(2.6)  $K(tu, t\,\cdot) \Rightarrow G(u^{-1}\,\cdot) =: K^*(u, \cdot), \qquad u > 0,$

where $K^*$ is any tail kernel associated with $G$; this is the form appearing in (2.1). Thus tail kernels
are scaling limits for kernels in a domain of attraction. In fact, tail kernels are the only possible
limits:
Proposition 2.2. Let $K$ be a transition kernel and $H$ be an arbitrary distribution on $[0,\infty)$. If
for each $u > 0$ there exists a distribution $G_u$ such that $K(tu, t\,\cdot) \Rightarrow G_u$ as
$t \to \infty$, then the function $\widehat{K}$ defined on $[0,\infty) \times \mathcal{B}[0,\infty)$ as

$\widehat{K}(u, A) := \begin{cases} G_u(A) & u > 0 \\ H(A) & u = 0 \end{cases}$

is a tail kernel associated with $G_1$.

Proof. It suffices to show that $G_u(\cdot) = G_1(u^{-1}\,\cdot)$ for any $u > 0$. But this follows
directly from the uniqueness of weak limits, since (2.6) shows that
$K(tu, t\,\cdot) \Rightarrow G_1(u^{-1}\,\cdot)$. □



A version of (2.6) uniform in $u$ is needed for fdd convergence results.

Proposition 2.3. Suppose $K \in D(G)$, and $K^*$ is a tail kernel associated with $G$. Then, for any
$u > 0$ and any non-negative function $u_t = u(t)$ such that $u_t \to u$ as $t \to \infty$, we have

(2.7)  $K(tu_t, t\,\cdot) \Rightarrow K^*(u, \cdot), \qquad (t \to \infty).$

Proof. Suppose $u_t \to u > 0$. Observe that $K(tu_t, t\,\cdot) = K(tu_t, (tu_t)\,u_t^{-1}\,\cdot)$, and put
$h_t(x) = u_t x$, $h(x) = ux$. Writing $P_t(\cdot) = K(tu_t, tu_t\,\cdot)$, we have

$K(tu_t, t\,\cdot) = P_t \circ h_t^{-1} \Rightarrow G \circ h^{-1} = G(u^{-1}\,\cdot) = K^*(u, \cdot)$

by [21, Theorem 5.5, p. 34]. □

The measure $G$ controls $X$ upon leaving an extreme state, and $H$ describes the possibility of
jumping from a non-extreme state to an extreme one. The traditional assumption (2.5) provides
no information about $H$, and in fact (2.7) may fail if $u = 0$; see Example 6.2. However, the choice
of $H$ cannot be ignored if $0$ is an accessible point of the state space, especially for cases where
$G(\{0\}) = K^*(y, \{0\}) > 0$. We propose pursuing implications of the traditional assumption (2.5)
alone, and will add conditions as needed to understand boundary behaviour of $X$.

Alternative, more general formulations of (2.5) include replacing $K(t, t\,\cdot)$ with
$K(t, a(t)\,\cdot)$ or $K(t, a(t)\,\cdot + b(t))$ with appropriate functions $a(t) > 0$ and $b(t)$, in
analogy with the usual domains of attraction conditions in extreme value theory. Indeed, the second
choice coincides with the original presentation by Perfekt [18], and relates to the conditional
extreme value model [8l [131 E].
For clarity, and to maintain ties with regular variation, we retain the standard choice $a(t) = t$,
$b(t) = 0$.

2.3. Representation. How do we characterize kernels belonging to $D(G)$? From (2.4), for chains
transitioning according to a tail kernel, the next state is a random multiple of the previous one,
provided the prior state is non-zero. We expect that chains transitioning according to $K \in D(G)$
behave approximately like this upon leaving a large state, and this is best expressed in terms of a
function describing how a new state depends on the prior one.

Given a kernel $K$, we can always find a sample space $E$, a measurable function
$\psi : [0,\infty) \times E \to [0,\infty)$ and an $E$-valued random element $V$ such that
$\psi(y, V) \sim K(y, \cdot)$ for all $y$. Given a random variable $X_0$, if we define the process
$X = (X_0, X_1, X_2, \dots)$ recursively as

$X_{n+1} = \psi(X_n, V_{n+1}), \qquad n \ge 0,$

where $\{V_n\}$ is an iid sequence equal in distribution to $V$ and independent of $X_0$, then $X$ is a
Markov chain with transition kernel $K$. Call the function $\psi$ an update function corresponding to
$K$. If in addition $K \in D(G)$, the domain of attraction condition (2.5) becomes

$t^{-1}\psi(t, V) \Rightarrow \xi,$

where $\xi \sim G$. Applying the probability integral transform or the Skorohod representation theorem
[21, Theorem 3.2, p. 6], [4, Theorem 6.7, p. 70], we get the following result.

Proposition 2.4. If $K$ is a transition kernel, $K \in D(G)$ if and only if there exists a measurable
function $\psi^* : [0,\infty) \times [0,1] \to [0,\infty)$ and a random variable $\xi^* \sim G$ on the
uniform probability space $([0,1], \mathcal{B}, \lambda)$ such that

(2.8)  $t^{-1}\psi^*(t, u) \to \xi^*(u) \qquad \forall u \in [0,1]$

as $t \to \infty$, and $\psi^*$ is an update function corresponding to $K$ in the sense that

$\lambda[\psi^*(y, \cdot) \in A] = K(y, A)$

for measurable sets $A$.

Think of the update function as $\psi^*(y, U)$ where $U(u) = u$ is a uniform random variable on $[0,1]$.

Proof. If there exist such $\psi^*$ and $\xi^*$ satisfying (2.8) then clearly $K \in D(G)$. Conversely,
suppose $\psi(\cdot, V)$ is an update function corresponding to $K$. According to Skorohod's
representation theorem (cf. Billingsley [4, p. 70], with the necessary modifications to allow for an
uncountable index set), there exists a random variable $\xi^*$ and a stochastic process
$\{Y_t^* ; t \ge 0\}$ defined on the uniform probability space $([0,1], \mathcal{B}, \lambda)$, taking
values in $[0,\infty)$, such that

$Y_0^* \overset{d}{=} \psi(0, V), \qquad Y_t^* \overset{d}{=} t^{-1}\psi(t, V) \quad \text{for } t > 0,$

and $Y_t^*(u) \to \xi^*(u)$ as $t \to \infty$ for every $u \in [0,1]$. Now, define
$\psi^* : [0,\infty) \times [0,1] \to [0,\infty)$ as

$\psi^*(0, u) = Y_0^*(u)$ and $\psi^*(t, u) = t\,Y_t^*(u), \quad t > 0, \qquad \forall u \in [0,1].$

It is evident that $\lambda[\psi^*(y, \cdot) \in A] = P[\psi(y, V) \in A]$ for $y \in [0,\infty)$, so
$\psi^*$ is indeed an update function corresponding to $K$, and $\psi^*$ satisfies (2.8) by
construction. □

Update functions corresponding to $K$ are not unique, and some of them may fail to converge
pointwise as in (2.8). However (2.8) is convenient, and Proposition 2.4 shows that Segers' [23]
Condition 2.2 in terms of update functions is equivalent to our weak convergence formulation
$K \in D(G)$.
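To make the pointwise convergence in (2.8) concrete, here is a small sketch based on the probability integral transform. The kernel is a hypothetical one chosen for the example, $K(y, \cdot)$ = exponential with mean $1 + y$, so that $G$ is the unit-rate exponential; the names `psi_star` and `xi_star` are illustrative, and the grid avoids $u = 1$, where the quantile is infinite.

```python
import math


def psi_star(y, u):
    """Quantile-transform update function for the assumed kernel
    K(y, .) = Exponential with mean 1 + y."""
    return -(1.0 + y) * math.log(1.0 - u)


def xi_star(u):
    """Pointwise limit of psi_star(t, u) / t: the quantile function of G = Exponential(1)."""
    return -math.log(1.0 - u)


u_grid = [0.1, 0.5, 0.9]
gaps = {
    t: max(abs(psi_star(t, u) / t - xi_star(u)) for u in u_grid)
    for t in (1e1, 1e3, 1e5)
}
print(gaps)
```

In this example the gap equals $\xi^*(u)/t$, so it shrinks at rate $1/t$ for each fixed $u$.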

Pointwise convergence in (2.8) gives an intuitive representation of kernels in a domain of
attraction.

Corollary 2.1. $K \in D(G)$ iff there exists a random variable $\xi \sim G$ defined on the uniform
probability space, and a measurable function $\phi : [0,\infty) \times [0,1] \to (-\infty, \infty)$
satisfying $t^{-1}\phi(t, u) \to 0$ for all $u \in [0,1]$ such that

(2.9)  $\psi(y, u) := \xi(u)\,y + \phi(y, u)$

is an update function corresponding to $K$.

Proof. If such $\xi$ and $\phi$ exist, then
$t^{-1}\psi(t, u) = \xi(u) + t^{-1}\phi(t, u) \to \xi(u)$ for all $u$, so $\psi$ satisfies
(2.8). The converse follows from (2.8). □

Many Markov chains such as ARCH, GARCH and autoregressive processes are specified by
structured recursions that allow quick recognition of update functions corresponding to kernels in
a domain of attraction. A common example is the update function $\psi(y, (Z, W)) = Zy + W$, which
behaves like $\psi'(y, Z) = Zy$ when $y$ is large; compare $\psi'$ to the form (2.4) discussed for tail
kernels. In general, if $K$ has an update function $\psi$ of the form

(2.10)  $\psi(y, (Z, W)) = Zy + \phi(y, W)$

for a random variable $Z \ge 0$ and a random element $W$, where $t^{-1}\phi(t, w) \to 0$ whenever
$w \in C$ for which $P[W \in C] = 1$, then $K \in D(G)$ with $G = P[Z \in \cdot\,]$. We will refer to
update functions satisfying (2.10) as being in canonical form.
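The canonical form can be illustrated with the update function $\psi(y, (Z, W)) = Zy + W$ mentioned above. The sketch below fixes one hypothetical realization $(z, w)$ and shows that the normalized update $t^{-1}\psi(t, (z, w))$ approaches $z$; the function names and the values of `z` and `w` are assumptions made for the example.

```python
def psi(y, z, w):
    """Update function in canonical form with phi(y, w) = w."""
    return z * y + w


def scaled_update(t, z, w):
    """Normalized next state t^{-1} * psi(t, (z, w))."""
    return psi(t, z, w) / t


# One fixed (hypothetical) realization of (Z, W).
z, w = 0.7, 3.0
errors = [abs(scaled_update(t, z, w) - z) for t in (1e1, 1e3, 1e5)]
print(errors)
```

The error is $w/t$, matching the requirement $t^{-1}\phi(t, w) \to 0$ in (2.10).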
3. Finite-Dimensional Convergence and the Extremal Component

Given a Markov chain $X \sim K \in D(G)$, we show that the finite-dimensional distributions (fdds)
of $X$, started from an extreme state, converge to those of the tail chain $T$ defined in (2.4). We
initially develop results that depend only on $G$ (but not $H$), and then clarify what behaviour of $X$
is controlled by $G$ and $H$ respectively. We make explicit links with prior work that did not consider
the notion of boundary distribution.

If $G(\{0\}) = 0$, the choice of $H$ is inconsequential, since $P[T \text{ eventually hits } \{0\}] = 0$
and $T$ is indistinguishable from the multiplicative random walk
$\{T_n^* = T_0\,\xi_1 \cdots \xi_n,\ n \ge 0\}$ (where $T_0 > 0$ and $\{\xi_n\}$ are iid $\sim G$ and
independent of $T_0$). In this case, assume without loss of generality that $H = \epsilon_0$. However,
if $G(\{0\}) > 0$, any result not depending on $H$ must be restricted to fdds conditional on the tail
chain not having yet hit $\{0\}$. For example, consider the trajectory of $(X_1, \dots, X_m)$, started
from $X_0 = t$, through the region $(t, \infty)^{m-2} \times [0, \delta] \times (t, \infty)$, where $t$
is a high level. The tail chain would model this as a path through
$(0,\infty)^{m-2} \times \{0\} \times (0,\infty)$, which requires specifying $H$ to control transitions
away from $\{0\}$.

This raises the question of how to interpret the first hitting time of $\{0\}$ for $T$ in terms of the
original Markov chain $X$. Such hitting times are important in the study of Markov chain point
process models of exceedance clusters based on the tail chain. Intuitively, a transition to $\{0\}$ by
$T$ represents a transition from an extreme state to a non-extreme state by $X$. We make this notion
precise in Section 3.2 by viewing such transitions as downcrossings of a certain level we term the
"extremal boundary."

We assume $X$ is a Markov chain on $[0,\infty)$ with transition kernel $K \in D(G)$, $K^*$ is a tail
kernel associated with $G$ with unspecified boundary distribution $H$, and $T$ is a Markov chain on
$[0,\infty)$ with kernel $K^*$. The finite-dimensional distributions of $X$, conditional on $X_0 = y$,
are given by

$P_y[(X_1, \dots, X_m) \in d\mathbf{x}_m] = K(y, dx_1)\,K(x_1, dx_2) \cdots K(x_{m-1}, dx_m),$

where $\mathbf{x}_m = (x_1, \dots, x_m)$, and analogously for $T$.



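3.1. FDDs Conditional on the Initial State. Define the conditional distributions

(3.1)  $\pi_m^{(t)}(u, \cdot) = P_{tu}\left[\left(\dfrac{X_1}{t}, \dots, \dfrac{X_m}{t}\right) \in \cdot\,\right]$ and $\pi_m(u, \cdot) = P_u[(T_1, \dots, T_m) \in \cdot\,], \qquad m \ge 1,$

on $[0,\infty) \times \mathcal{B}[0,\infty]^m$. We consider when $\pi_m^{(t)} \Rightarrow \pi_m$ on
$[0,\infty]^m$ pointwise in $u$. If $G(\{0\}) = 0$, this is a direct consequence of the domain of
attraction condition (2.5), but if $G(\{0\}) > 0$, more thought is required. We begin by restricting the
convergence to the smaller space $\mathbb{E}'_m := (0,\infty]^{m-1} \times [0,\infty]$.
Relatively compact sets in $\mathbb{E}'_m$ are contained in rectangles
$[\mathbf{a}, \infty] \times [0,\infty]$, where $\mathbf{a} \in (0,\infty)^{m-1}$.

Theorem 3.1. Let $u_t = u(t)$ be a non-negative function such that $u_t \to u > 0$ as $t \to \infty$.

(a) The restrictions to $\mathbb{E}'_m$,

(3.2)  $\mu_m^{(t)}(u, \cdot) := \pi_m^{(t)}(u, \cdot \cap \mathbb{E}'_m)$ and $\mu_m(u, \cdot) := \pi_m(u, \cdot \cap \mathbb{E}'_m),$

satisfy

(3.3)  $\mu_m^{(t)}(u_t, \cdot) \xrightarrow{v} \mu_m(u, \cdot)$ in $M_+(\mathbb{E}'_m) \qquad (t \to \infty).$

(b) If $G(\{0\}) = 0$, we have

(3.4)  $\pi_m^{(t)}(u_t, \cdot) \Rightarrow \pi_m(u, \cdot)$ on $[0,\infty]^m \qquad (t \to \infty).$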

Proof. The Markov structure suggests an induction argument facilitated by Lemma 8.2 (p. 21).
Consider (a) first. If $m = 1$, then (3.3) above reduces to (2.7). Assume $m \ge 2$, and let
$f \in C_K^+(\mathbb{E}'_m)$. Writing $\mathbb{E}'_m = (0,\infty] \times \mathbb{E}'_{m-1}$, we can find
$a > 0$ and $B \in \mathcal{K}(\mathbb{E}'_{m-1})$ such that $f$ is supported on
$[a, \infty] \times B$. Now, observe that

$\mu_m^{(t)}(u_t, \cdot)(f) = \displaystyle\int_{(0,\infty]} K(tu_t, t\,dx_1) \int_{\mathbb{E}'_{m-1}} K(tx_1, t\,dx_2) \cdots K(tx_{m-1}, t\,dx_m)\, f(\mathbf{x}_m)$

$\qquad = \displaystyle\int_{(0,\infty]} K(tu_t, t\,dx_1) \int_{\mathbb{E}'_{m-1}} \mu_{m-1}^{(t)}(x_1, d(x_2, \dots, x_m))\, f(\mathbf{x}_m).$

Defining

$h_t(v) = \displaystyle\int_{\mathbb{E}'_{m-1}} \mu_{m-1}^{(t)}(v, d\mathbf{x}_{m-1})\, f(v, \mathbf{x}_{m-1})$ and $h(v) = \displaystyle\int_{\mathbb{E}'_{m-1}} \mu_{m-1}(v, d\mathbf{x}_{m-1})\, f(v, \mathbf{x}_{m-1}),$

the previous expression becomes

$\mu_m^{(t)}(u_t, \cdot)(f) = \displaystyle\int_{(0,\infty]} K(tu_t, t\,dv)\, h_t(v).$

Now, suppose $v_t \to v > 0$: we verify

(3.5)  $h_t(v_t) \to h(v).$

By continuity, we have $f(v_t, \mathbf{x}^t_{m-1}) \to f(v, \mathbf{x}_{m-1})$ whenever
$\mathbf{x}^t_{m-1} \to \mathbf{x}_{m-1}$, and the induction hypothesis provides
$\mu_{m-1}^{(t)}(v_t, \cdot) \xrightarrow{v} \mu_{m-1}(v, \cdot)$. Also, $f(x, \cdot)$ has compact
support $B$ (without loss of generality, $\mu_{m-1}(v, \partial B) = 0$). Combining these facts, (3.5)
follows from Lemma 8.2 (b). Next, since the $h_t$ and $h$ have common compact support $[a, \infty]$,
and recalling from Proposition 2.3 that $K(tu_t, t\,\cdot) \Rightarrow K^*(u, \cdot)$, Lemma 8.2 (a)
yields

$\mu_m^{(t)}(u_t, \cdot)(f) \to \displaystyle\int_{(0,\infty]} K^*(u, dv)\, h(v) = \mu_m(u, \cdot)(f).$

Implication (b) follows from essentially the same argument. For $m \ge 2$, suppose
$f \in C[0,\infty]^m$. Replacing $\mu$ by $\pi$ and $\mathbb{E}'_{m-1}$ by $[0,\infty]^{m-1}$ in the
definitions of $h_t$ and $h$, we have

$\pi_m^{(t)}(u_t, \cdot)(f) = \displaystyle\int_{[0,\infty]} K(tu_t, t\,dv)\, h_t(v).$

This time Lemma 8.2 (a) shows that $h_t(v_t) \to h(v)$ if $v_t \to v > 0$, and since
$K^*(u, (0,\infty]) = 1$, resorting to Lemma 8.2 (a) once more yields

$\pi_m^{(t)}(u_t, \cdot)(f) \to \displaystyle\int_{[0,\infty]} K^*(u, dv)\, h(v) = \pi_m(u, \cdot)(f).$ □

If $G(\{0\}) > 0$, then $K^*(u, (0,\infty]) = 1 - G(\{0\}) < 1$, and for (3.4) to hold would require
knowing the behaviour of $h_t(v_t)$ when $v_t \to 0$ as well. Behaviour near zero is controlled by an
asymptotic condition related to the boundary distribution $H$. Previous work handled this using the
regularity condition discussed in Section 4.

3.2. The Extremal Boundary. The normalization employed in the domain of attraction
condition (2.5) suggests that, starting from a large state $t$, the extreme states are approximately
scalar multiples of $t$. For example, we would consider a transition from $t$ into $(t/3, 2t]$ to remain
extreme. Thus, we think of states which can be made smaller than $t\delta$ for any $\delta$, if $t$ is
large enough, as non-extreme. In this context, the set $[0, \sqrt{t}]$ would consist of non-extreme
states.

Under (2.5), a tail chain path through $(0,\infty)$ models the original chain $X$ travelling among
extreme states, and all of the non-extreme states are compacted into the state $\{0\}$ in the state space
of $T$. Therefore, if $X$ is started from an extreme state, the portion of the tail chain depending solely
on $G$ is informative up until the first time $X$ crosses down to a non-extreme state. If
$G(\{0\}) = 0$, such a transition would become more and more unlikely as the initial state increases, in
which case $G$ provides a complete description of the behaviour of $X$ in any finite number of steps
following a visit to an extreme state (Theorem 3.1 (b)).

Drawing upon this interpretation, we develop a rigorous formulation of the distinction between
extreme and non-extreme states, and we recast Theorem 3.1 as convergence on the unrestricted
space $[0,\infty]^m$ of the conditional fdds, given that $X$ has not yet reached a non-extreme state.

Definition. Suppose $K \in D(G)$. An extremal boundary for $K$ is a non-negative function $y(t)$
defined on $[0,\infty)$, satisfying $\lim_{t\to\infty} y(t) = 0$ and

(3.6)  $K(t, t\,[0, y(t)]) \to G(\{0\})$ as $t \to \infty$.

Such a function is guaranteed to exist by Lemma 8.3 (p. 23).

If $G(\{0\}) = 0$, then $y(t) \equiv 0$ is a trivial choice. For any function $0 \le y(t) \to 0$, we have
$\limsup_{t\to\infty} K(t, t\,[0, y(t)]) \le G(\{0\})$, so (3.6) is equivalent to

(3.7)  $\liminf_{t\to\infty} K(t, t\,[0, y(t)]) \ge G(\{0\}).$

If $y(t)$ is an extremal boundary, it follows that any function $0 \le \tilde{y}(t) \to 0$ with
$\tilde{y}(t) \ge y(t)$ for $t \ge t_0$ is also an extremal boundary for $K$. Taking
$\tilde{y}(t) = \bigvee_{s \ge t} y(s)$ shows that without loss of generality,
we can assume $y(t)$ to be non-increasing.

The extremal boundary has a natural formulation in terms of the update function. As in (2.10),
let $\psi(y, (Z, W)) = Zy + \phi(y, W)$ be an update function in canonical form, where $y$ is extreme. If
$Z > 0$ then the next state is approximately $Zy$, another extreme state. Otherwise, if $Z = 0$, the
next state is $\phi(y, W)$, and a transition from an extreme to a non-extreme state has taken place.
This suggests choosing an extremal boundary whose order is between $t$ and $\phi(t, w)$.

Proposition 3.1. Suppose $\psi(y, (Z, W))$ is an update function in canonical form as in (2.10). If
$\zeta(t) > 0$ is a function on $[0,\infty)$ such that

(3.8)  $\phi(t, w)/\zeta(t) \to 0$

as $t \to \infty$ whenever $w \in B$ for which $P[W \in B] = 1$, then
$\liminf_{t\to\infty} K(t, [0, \zeta(t)]) \ge G(\{0\})$.
Provided $\lim_{t\to\infty} \zeta(t)/t = 0$, an extremal boundary is given by $y(t) := \zeta(t)/t$.

Thus if $\phi(t, w) = o(\zeta(t))$ and $\zeta(t) = o(t)$ then $\zeta(t)/t$ is an extremal boundary. For
example, if $\psi(y, (Z, W)) = Zy + W$, so that $\phi(t, w) = w$, then choosing $\zeta(t)$ to be any
function $\zeta(t) \to \infty$ such that $\zeta(t) = o(t)$ makes $\zeta(t)/t$ an extremal boundary.
Choosing $\zeta(t) = \sqrt{t}$ we find that $y(t) = 1/\sqrt{t}$
is an extremal boundary.
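The example $\psi(y, (Z, W)) = Zy + W$ with $y(t) = 1/\sqrt{t}$ can be explored by Monte Carlo. The sketch below estimates $K(t, t\,[0, y(t)])$ and compares it with $G(\{0\}) = P[Z = 0]$. The distributions are assumptions made for the illustration: $Z$ has mass 0.3 at zero and is otherwise uniform on [0.5, 1.5], and $W$ is unit-rate exponential; `boundary_mass` is a hypothetical helper name.

```python
import math
import random


def update(y, z, w):
    """Update function psi(y, (z, w)) = z*y + w."""
    return z * y + w


def boundary_mass(t, n_draws, rng, p_zero=0.3):
    """Monte Carlo estimate of K(t, t*[0, y(t)]) with y(t) = 1/sqrt(t)."""
    level = t * (1.0 / math.sqrt(t))  # t * y(t)
    hits = 0
    for _ in range(n_draws):
        # Assumed Z: atom at zero with probability p_zero, else Uniform(0.5, 1.5).
        z = 0.0 if rng.random() < p_zero else rng.uniform(0.5, 1.5)
        w = rng.expovariate(1.0)  # assumed W: unit-rate exponential
        if update(t, z, w) <= level:
            hits += 1
    return hits / n_draws


rng = random.Random(1)
estimates = {t: boundary_mass(t, 20000, rng) for t in (1e2, 1e4, 1e6)}
print(estimates)
```

Under these assumptions the estimates should sit near 0.3 for all three levels, since a jump below $\sqrt{t}$ essentially requires $Z = 0$.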

Proof. Since

$P[\psi(t, (Z, W)) \le \zeta(t),\ Z = 0] = P[\phi(t, W) \le \zeta(t),\ Z = 0] \ge P[\,|\phi(t, W)| \le \zeta(t),\ Z = 0]$

$\qquad \ge P[Z = 0] - P\left[\dfrac{|\phi(t, W)|}{\zeta(t)} > 1\right] \to P[Z = 0],$

we have

$\liminf_{t\to\infty} K(t, [0, \zeta(t)]) = \liminf_{t\to\infty} P[\psi(t, (Z, W)) \le \zeta(t)] \ge P[Z = 0].$ □

We will need an extremal boundary for which (3.6) still holds upon replacing the initial state $t$
with $tu_t$, where $u_t \to u > 0$. Compare the following extension with Proposition 2.3.



Proposition 3.2. If $K \in D(G)$, then there exists an extremal boundary $y^*(t)$ such that

(3.9)  $K(tu_t, t\,[0, y^*(t)]) \to G(\{0\})$ as $t \to \infty$

for any non-negative function $u_t = u(t) \to u > 0$.

We will refer to $y^*$ as a uniform extremal boundary.

Proof. Let $y(t)$ be an extremal boundary for $K$. As a first step, fix $u_0 > 1$, and suppose
$u_0^{-1} < u < u_0$. Define $\tilde{y}(t) = u_0\, y(t u_0^{-1})$. Now, if $u_t \to u$, then
$y^{(u)}(t) := u_t\, y(tu_t)$ satisfies (3.9), since

$K(tu_t, t\,[0, y^{(u)}(t)]) = K(tu_t, tu_t\,[0, y(tu_t)]) \to G(\{0\}).$

Here $y^{(u)}$ depends on the choice of function $u_t$. However, since we eventually have
$u_0^{-1} < u_t < u_0$ for $t$ large enough, it follows that $\tilde{y}(t) \ge y^{(u)}(t)$ for such $t$.
Hence, $\tilde{y}(t)$ satisfies (3.9) for any $u_t \to u$ with $u_0^{-1} < u < u_0$.

Next, we remove the restriction in $u_0$ via a diagonalization argument. For $k = 2, 3, \dots$, let
$y_k(t)$ be extremal boundaries such that $K(tu_t, t\,[0, y_k(t)]) \to G(\{0\})$ whenever $u_t \to u$
for $u \in (k^{-1}, k)$, and put $y_0 = y_1 = y$. Next, define the sequence
$\{(s_k, x_k) : k = 0, 1, \dots\}$ inductively as follows. Setting $s_0 = 0$ and $x_0 = y_0(1)$, choose
$s_k \ge s_{k-1} + 1$ such that $y_j(t) \le k^{-1} \wedge x_{k-1}$ for all $j = 0, \dots, k$ whenever
$t \ge s_k$, and put $x_k = \max\{y_j(s_k) : j = 0, \dots, k\}$. Note that
$x_k \le k^{-1} \wedge x_{k-1}$, so $x_k \downarrow 0$, and $s_k \uparrow \infty$. Finally, set

$y^*(t) = \displaystyle\sum_{k=0}^{\infty} x_k\, \mathbf{1}_{[s_k, s_{k+1})}(t).$

Observe that $0 \le y^*(t) \downarrow 0$, and suppose $u_t \to u > 0$. Then $u \in (k_0^{-1}, k_0)$ for
some $k_0$, so $K(tu_t, t\,[0, y_{k_0}(t)]) \to G(\{0\})$, and for $k \ge k_0$, our construction ensures
that whenever $s_k \le t < s_{k+1}$, we have $y_{k_0}(t) \le y_{k_0}(s_k) \le x_k = y^*(t)$. Therefore,
$y^*(t) \ge y_{k_0}(t)$ for $t \ge s_{k_0}$, so $y^*$ satisfies (3.9). □



Henceforth, we assume any $K \in D(G)$ is accompanied by a uniform extremal boundary denoted
by $y(t)$, and we consider extreme states on the order of $t$ to be $(ty(t), \infty]$. If
$G(\{0\}) = 0$, then all positive states are extreme states. We now use the extremal boundary to
reformulate the convergence of Theorem 3.1 on the larger space $[0,\infty]^m$. Put
$\mathbb{E}'_m(t) = (y(t), \infty]^{m-1} \times [0,\infty]$, so that
$\mathbb{E}'_m(t) \uparrow \mathbb{E}'_m = (0,\infty]^{m-1} \times [0,\infty]$. Recall the notation
$\mu_m^{(t)}$ and $\mu_m$ from (3.1), (3.2) in Theorem 3.1.

Theorem 3.2. Let $u_t = u(t)$ be a non-negative function such that $u_t \to u > 0$ as $t \to \infty$.
Taking

$\tilde{\mu}_m^{(t)}(u, \cdot) = \pi_m^{(t)}(u, \cdot \cap \mathbb{E}'_m(t)),$

we have

$\tilde{\mu}_m^{(t)}(u_t, \cdot) \xrightarrow{v} \mu_m(u, \cdot)$ in $M_+[0,\infty]^m \qquad (t \to \infty).$

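Proof. Note that we can just as well write
$\tilde{\mu}_m^{(t)}(u, \cdot) = \mu_m^{(t)}(u, \cdot \cap \mathbb{E}'_m(t))$. Suppose $m \ge 2$ and let
$f \in C_K^+[0,\infty]^m$. For $\delta > 0$, define $A_\delta = (\delta, \infty]^{m-1} \times [0,\infty]$,
and choose $\delta$ such that $\mu_m(u, \partial A_\delta) = 0$. On the one hand, for large $t$ we have

$\tilde{\mu}_m^{(t)}(u_t, \cdot)(f) = \displaystyle\int_{[0,\infty]^m} f(\mathbf{x})\, \mathbf{1}_{\mathbb{E}'_m(t)}(\mathbf{x})\, \mu_m^{(t)}(u_t, d\mathbf{x}) \ge \int_{\mathbb{E}'_m} f(\mathbf{x})\, \mathbf{1}_{A_\delta}(\mathbf{x})\, \mu_m^{(t)}(u_t, d\mathbf{x})$

$\qquad \to \displaystyle\int_{\mathbb{E}'_m} f(\mathbf{x})\, \mathbf{1}_{A_\delta}(\mathbf{x})\, \mu_m(u, d\mathbf{x})$

as $t \to \infty$ by Lemma [531 (p- [22]). Letting $\delta \downarrow 0$ yields

(3.10)  $\liminf_{t\to\infty} \tilde{\mu}_m^{(t)}(u_t, \cdot)(f) \ge \mu_m(u, \cdot)(f)$

by monotone convergence. On the other hand, fixing $\delta$, we can decompose the space according to
the first downcrossing of $\delta$:

(3.11)  $\tilde{\mu}_m^{(t)}(u_t, \cdot)(f) = \displaystyle\int_{[0,\infty]^m} f(\mathbf{x})\, \mathbf{1}_{A_\delta}(\mathbf{x})\, \tilde{\mu}_m^{(t)}(u_t, d\mathbf{x}) + \sum_{k=1}^{m-1} \int_{[0,\infty]^m} f(\mathbf{x})\, \mathbf{1}_{A_\delta^k}(\mathbf{x})\, \tilde{\mu}_m^{(t)}(u_t, d\mathbf{x}),$

where $A_\delta^k = (\delta, \infty]^{k-1} \times [0, \delta] \times [0,\infty]^{m-k}$. On the subsets
$A_\delta^k$ we appeal to the bound on $f$, say $M$, to obtain

$\displaystyle\int_{[0,\infty]^m} f(\mathbf{x})\, \mathbf{1}_{A_\delta^k}(\mathbf{x})\, \tilde{\mu}_m^{(t)}(u_t, d\mathbf{x}) \le M\, \tilde{\mu}_m^{(t)}(u_t, A_\delta^k).$

Now,

(3.12)  $\tilde{\mu}_m^{(t)}(u_t, A_\delta^k) \le \mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times (y(t), \delta])$

$\qquad = \mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times [0, \delta]) - \mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times [0, y(t)]).$

Considering the second term, we have

$\mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times [0, y(t)])$

$\qquad = \displaystyle\int_{[0,\infty]} K(tu_t, t\,dx_1)\, \mathbf{1}_{(\delta,\infty]}(x_1) \cdots \int_{[0,\infty]} K(tx_{k-2}, t\,dx_{k-1})\, \mathbf{1}_{(\delta,\infty]}(x_{k-1})\, K(tx_{k-1}, t\,[0, y(t)])$

$\qquad = \displaystyle\int \mu_{k-1}^{(t)}(u_t, d\mathbf{x}_{k-1})\, h_t(\mathbf{x}_{k-1}),$

where

$h_t(\mathbf{x}_{k-1}) = K(tx_{k-1}, t\,[0, y(t)])\, \mathbf{1}_{(\delta,\infty]^{k-1}}(\mathbf{x}_{k-1}).$

Moreover, if $\mathbf{x}^t_{k-1} \to \mathbf{x}_{k-1} \in (\delta, \infty]^{k-1}$, then

$h_t(\mathbf{x}^t_{k-1}) = K(tx^t_{k-1}, t\,[0, y(t)])\, \mathbf{1}_{(\delta,\infty]^{k-1}}(\mathbf{x}^t_{k-1}) \to G(\{0\})\, \mathbf{1}_{(\delta,\infty]^{k-1}}(\mathbf{x}_{k-1}),$

using the fact that $y(t)$ is a uniform extremal boundary. Since
$\mu_{k-1}(u, \partial(\delta, \infty]^{k-1}) = 0$ without loss of generality by choice of $\delta$, we
conclude that

$\mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times [0, y(t)]) \to G(\{0\}) \cdot \mu_{k-1}(u, (\delta, \infty]^{k-1}) = \mu_k(u, (\delta, \infty]^{k-1} \times \{0\})$

as $t \to \infty$. Now, let us return to (3.12). Given any $\epsilon > 0$, by choosing $\delta$ small
enough, we can make

$\mu_k^{(t)}(u_t, (\delta, \infty]^{k-1} \times (y(t), \delta]) \to \mu_k(u, (\delta, \infty]^{k-1} \times [0, \delta]) - \mu_k(u, (\delta, \infty]^{k-1} \times \{0\})$

$\qquad \le \mu_k(u, (0, \infty]^{k-1} \times [0, \delta]) - \mu_k(u, (\delta, \infty]^{k-1} \times \{0\})$

$\qquad < \left(\mu_k(u, (0, \infty]^{k-1} \times \{0\}) + \dfrac{\epsilon}{2}\right) - \left(\mu_k(u, (0, \infty]^{k-1} \times \{0\}) - \dfrac{\epsilon}{2}\right) = \epsilon,$

i.e.

(3.13)  $\limsup_{t\to\infty} \tilde{\mu}_m^{(t)}(u_t, A_\delta^k) \le \epsilon,$

for $k = 1, \dots, m-1$. Therefore, (3.11) implies that, given $\epsilon' > 0$,

$\limsup_{t\to\infty} \tilde{\mu}_m^{(t)}(u_t, \cdot)(f) \le \displaystyle\int_{[0,\infty]^m} f(\mathbf{x})\, \mathbf{1}_{A_\delta}(\mathbf{x})\, \mu_m(u, d\mathbf{x}) + M \sum_{k=1}^{m-1} \limsup_{t\to\infty} \tilde{\mu}_m^{(t)}(u_t, A_\delta^k)$

$\qquad \le \mu_m(u, \cdot)(f) + \epsilon'$

for small enough $\delta$. Combining this with (3.10) yields the result. □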
3.3. The Extremal Component. Having thus formalized the distinction between extreme and
non-extreme states, we return to the question of phrasing a fdd limit result for $X$ when $H$ is
unspecified. The extremal boundary allows us to interpret the first hitting time of $\{0\}$ by the tail
chain as approximating the time of the first transition from extreme down to non-extreme. In this
terminology, Theorem 3.2 provides a result, given that such a transition has yet to occur.
Define the first hitting time of a non-extreme state

$\tau(t) = \inf\{n \ge 0 : X_n \le t\,y(t)\}.$

For a Markov chain started from $tu_t$, where $u_t \to u > 0$, we have $tu_t > t\,y(t)$ for large $t$, so
$\tau(t)$ is the first downcrossing of the extremal boundary.

For the tail chain $T$, put $\tau^* = \inf\{n \ge 0 : T_n = 0\}$. Given $T_0 > 0$, write
$\tau^* = \inf\{n \ge 1 : \xi_n = 0\}$, where $\{\xi_n\} \sim G$ are iid and independent of $T_0$, i.e.
$\tau^*$ follows a Geometric distribution with parameter $p = G(\{0\})$. Thus,
$P[\tau^* = m] = p(1-p)^{m-1}$ for $m \ge 1$ if $p > 0$, and $P[\tau^* = \infty] = 1$ if
$p = 0$. Writing $\mathbf{X}_m = (X_1, \dots, X_m)$ and $\mathbf{T}_m = (T_1, \dots, T_m)$, Theorem 3.2
becomes

(3.14)  $P_{tu_t}[t^{-1}\mathbf{X}_m \in \cdot\,,\ \tau(t) > m] \xrightarrow{v} P_u[\mathbf{T}_m \in \cdot\,,\ \tau^* > m],$

implying that $\tau^*$ approximates $\tau(t)$:

(3.15)  $P_{tu_t}[\tau(t) \in \cdot\,] \Rightarrow P[\tau^* \in \cdot\,], \qquad (t \to \infty,\ u_t \to u > 0).$

So if $G(\{0\}) > 0$, $X$ takes an average of approximately $G(\{0\})^{-1}$ steps to return to a
non-extreme state, but if $G(\{0\}) = 0$, $P_{tu_t}[\tau(t) \le m] \to 0$ for any $m \ge 1$, so starting
from a larger and larger initial state, it will take longer and longer for $X$ to cross down to a
non-extreme state.
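The geometric approximation in (3.15) can be illustrated by simulation. The sketch below reuses the assumed chain $X_{n+1} = Z_{n+1} X_n + W_{n+1}$ with $P[Z = 0] = 0.3$, $Z$ otherwise uniform on [0.5, 1.5], $W$ unit-rate exponential, and extremal boundary $y(t) = 1/\sqrt{t}$; all of these, and the helper `first_downcrossing`, are hypothetical choices for the example rather than anything prescribed by the source.

```python
import math
import random


def first_downcrossing(t, u, max_steps, rng, p_zero=0.3):
    """Return tau(t) = inf{n >= 0 : X_n <= t*y(t)} with y(t) = 1/sqrt(t),
    or max_steps + 1 if no downcrossing occurs within max_steps steps."""
    level = math.sqrt(t)  # t * y(t)
    x = t * u             # initial state X_0 = t*u
    for n in range(max_steps + 1):
        if x <= level:
            return n
        # Assumed update: X_{n+1} = Z * X_n + W.
        z = 0.0 if rng.random() < p_zero else rng.uniform(0.5, 1.5)
        x = z * x + rng.expovariate(1.0)
    return max_steps + 1


rng = random.Random(2)
draws = [first_downcrossing(1e8, 1.0, 10, rng) for _ in range(5000)]
freq = {m: sum(d == m for d in draws) / len(draws) for m in (1, 2, 3)}
geom = {m: 0.3 * 0.7 ** (m - 1) for m in (1, 2, 3)}
print(freq, geom)
```

With these assumed inputs the empirical frequencies of $\tau(t) = 1, 2, 3$ should be close to $p(1-p)^{m-1}$ with $p = 0.3$, in line with the interpretation of $\tau^*$ as a geometric variable.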

Let T* be the tail chain associated with {G, eo). For ~ G iid and independent of Tq*, 

(3.16) T,*=To*ei---Cn. 

We restate (13.140 in terms of a process derived from X , called the extremal component of X, whose 
fdds converge weakly to those of T* . The extremal component is the part of X whose asymptotic 
behavior is controlled by G alone. 

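Definition. The extremal component of $X$ relative to $t$ is the process $X^{(t)}$ defined for $t > 0$ as

$X^{(t)}_n = X_n \cdot 1_{\{n < \tau(t)\}}, \quad n = 0, 1, \ldots$

Observe that $X^{(t)}$ is a Markov chain on $[0, \infty)$ with transition kernel

$K^{(t)}(x, A) = \begin{cases} K(x, A \cap (t\,y(t), \infty]) + \epsilon_0(A)\,K(x, [0, t\,y(t)]) & x > t\,y(t) \\ \epsilon_0(A) & x \le t\,y(t). \end{cases}$

It follows that $K^{(t)}(t, t\,\cdot) \Rightarrow G$ as $t \to \infty$, and additionally that $K^{(t)}(t, \{0\}) \to G(\{0\})$. The relation between the component processes $X^{(t)}$, $T^*$ and the complete ones is

$P_{tu_t}[t^{-1}\mathbf{X}^{(t)}_m \in \cdot \mid \tau(t) > m] = P_{tu_t}[t^{-1}\mathbf{X}_m \in \cdot \mid \tau(t) > m]$

and

$P_u[\mathbf{T}^*_m \in \cdot \mid \tau^* > m] = P_u[\mathbf{T}_m \in \cdot \mid \tau^* > m]$.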
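As an illustration of the definition, the extremal component of an observed path can be computed by locating the first downcrossing and setting everything from that time on to zero. This is a minimal sketch; the helper names are hypothetical, and it assumes that a state $x$ counts as non-extreme when $x \le t\,y(t)$.

```python
def first_downcrossing(path, t, y_t):
    """tau(t) = inf{n >= 0 : X_n <= t * y(t)}; None if the path never crosses."""
    level = t * y_t
    for n, x in enumerate(path):
        if x <= level:
            return n
    return None

def extremal_component(path, t, y_t):
    """X^(t)_n = X_n * 1{n < tau(t)}: keep the path before tau(t), then 0."""
    tau = first_downcrossing(path, t, y_t)
    if tau is None:
        return list(path)
    return list(path[:tau]) + [0.0] * (len(path) - tau)
```

A path that drops below the boundary and later jumps back up is thus truncated at the drop, which is exactly the part of the behaviour that the tail chain cannot see.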
Theorem 3.3. Let $u_t = u(t) \ge 0$ satisfy $u_t \to u > 0$ as $t \to \infty$. Then on $[0,\infty]^m$,

$P_{tu_t}[t^{-1}(X^{(t)}_1, \ldots, X^{(t)}_m) \in \cdot\,] \Rightarrow P_u[(T_1^*, \ldots, T_m^*) \in \cdot\,]$  $(t \to \infty)$.

Proof. Suppose $m \ge 2$ and $f \in C[0,\infty]^m$, and assume first that $f \ge 0$. Then $f \in C_K^+[0,\infty]^m$ as well, since the space is compact. Recall the notation of Theorem 3.2. Conditioning on $\tau(t)$, we can write

{ut, • ) (/) = / /(a^m) {ut , dx^) + V / f{x^) ^t] [ut , dx^ 

„ m 

= / /(a^m) vf^^-' (^it , dxm) + f^^^' 0, . . . , 0) Tlf' {ut , dXk) 

J(0,oo]™ ^(O,oo]fe-ix{0} 

by the Markov property. Since 



n (0,oo]™) = Ptu, G • , T{t) >m]= Ptu, [t-^X^ G • n (y(t),oo] 



= M!^+i(nt, • X [0,oo]) , 
the first term becomes 

■)(/) ^ /im+l(n, •)(/) = / f{Xm) vrm(n, da;^) = / fiXm) Pu[T*m e dXrn] 

J(0,oo]™ J(0,oo]™ 

as $t \to \infty$. Next, for any $A \subset [0,\infty]^k$ measurable, write $A_0 = \{\mathbf{x}_{k-1} : (\mathbf{x}_{k-1}, 0) \in A\} \subset [0,\infty]^{k-1}$, and observe that

(nt , (0,00]'=-^ X {0}) = Ptu, [f'x^l, G n (0,00]'=-^ , = 0] 

= Ptu, [t-^Xk-i eAoD (y(t),oo]^-i , t-^Xk < y{t)] 
= Jx^^ {ut , Ao X [0, 00]) - il^l^ {ut , Ao X [0, oo]2) . 
Applying this reasoning to the terms in the summation yields 

/(ajfc^i, 0, . . . , 0) Jj-k\ut , dxk) - / f{xk-i, 0, . . . , 0) pt^L (n* , dxk+i) 



/ fixk-1,0, ...,0) Hk{u, dxk) - f{xk-i,0, ...,0) fik+i{u, dxk+i) 

J[0,oo]'' J[0,oo]'^+i 

:/ f{Xk,0,...,0)7^k{u,dXk)= f{Xm) Pu[T*^(^ dXr 

J(0,ool'''-ixiO> J('0.ool*-ixlOl-"'-'=+i 



'(O,oo]'''-ix{0} J{0,oo]*-ix{0}'' 

Combining these limits shows that $E_{tu_t} f(t^{-1}\mathbf{X}^{(t)}_m) \to E_u f(\mathbf{T}^*_m)$ as $t \to \infty$. Finally, if $f$ is not non-negative, then write $f = f_+ - f_-$. Since each of $f_+$ and $f_-$ is non-negative, bounded, and continuous, we can apply the above argument to each. □



4. The Regularity Condition 

Previous work on the tail chain derives fdd convergence of $X$ to $T^*$ under a single assumption analogous to our domain of attraction condition (2.5). As we observed in Section 3.1, when $G(\{0\}) = 0$, fdd convergence of $\{t^{-1}X\}$ follows directly, but when $G(\{0\}) > 0$, it was common to assume an additional technical condition which made (2.5) imply fdd convergence to $T^*$ as well. This condition, which we refer to as the "regularity condition," is an asymptotic convergence assumption prescribing the boundary distribution to be $H = \epsilon_0$. We consider equivalences between different forms appearing in the literature, in terms of both kernels and update functions, and show that, under the regularity condition, the extremal behaviour of $X$ is asymptotically the same as that of its extremal component $X^{(t)}$.
In cases where $G(\{0\}) > 0$, Perfekt [18, 19] requires that

(4.1)  $\lim_{\delta \downarrow 0} \limsup_{t \to \infty} \sup_{u \in [0,\delta]} K(tu, (t,\infty]) = 0$,

while Segers [23] stipulates that the chosen update function corresponding to $K$ must be of at most linear order in the initial state:

(4.2)  $\limsup_{t \to \infty} \sup_{0 \le y \le t} t^{-1}\psi(y, v) < \infty$,  $(v \in B_0,\ P[V \in B_0] = 1)$.

Smith [24] used a variant of (4.1). We deem a formulation in terms of distributional convergence to be instructive in our context.

Definition. A Markov transition kernel $K \in D(G)$ satisfies the regularity condition if

(4.3)  $K(tu_t, t\,\cdot) \Rightarrow \epsilon_0(\cdot)$

on $[0,\infty]$ as $t \to \infty$ for any non-negative function $u_t = u(t) \to 0$.

Note that in (2.7), we had $u_t \to u > 0$. We interpret (4.3) as designating the boundary distribution $H$ to be $\epsilon_0$.

We now consider the relationships between (4.1), (4.2) and (4.3), and propose an intuitive equivalent for update functions in canonical form.

Proposition 4.1. Suppose $K \in D(G)$, and let $\psi(\cdot, V)$ be an update function corresponding to $K$ such that

(4.4)  $t^{-1}\psi(t, v) \to \xi(v)$

whenever $v \in B$ for which $P[V \in B] = 1$, and $\xi \circ V \sim G$. Then:

(a) Condition (4.1) is necessary and sufficient for $K$ to satisfy the regularity condition (4.3).

(b) Condition (4.2) is sufficient for $K$ to satisfy the regularity condition (4.3).

(c) If $\psi$ is in canonical form, i.e.

$\psi(y, (Z, W)) = Zy + \phi(y, W)$,

then $\psi$ satisfies (4.2) if and only if $\phi(\cdot, w)$ is bounded on any neighbourhood of 0 for each $w \in C$, a set for which $P[W \in C] = 1$.

Proof. (a) Assume (4.1), and suppose $u_t \to 0$. We show $K(tu_t, t(x,\infty]) \to 0$ for any $x > 0$. Write

$\omega(t, \delta) = \sup_{u \in [0,\delta]} K(tu, (t,\infty])$.

Let $\epsilon > 0$ be given, and choose $\delta$ small enough that $\limsup_{t\to\infty} \omega(t, \delta) < \epsilon/2$. Then for $t$ large enough that $u_t < \delta x$, we have

$K(tu_t, t(x,\infty]) \le \sup_{u \in [0,\delta x]} K(tu, t(x,\infty]) = \omega(tx, \delta) < \limsup_{t\to\infty} \omega(t, \delta) + \epsilon/2$

for $t$ large enough. Our choice of $\delta$ implies that $K(tu_t, t(x,\infty]) < \epsilon$.

Conversely, assume that $K$ satisfies (4.3) but that (4.1) fails. Choose $\epsilon > 0$ and a sequence $\delta_n \downarrow 0$ such that $\limsup_{t\to\infty} \omega(t, \delta_n) > \epsilon$ for $n = 1, 2, \ldots$. Then for each $n$ we can find a sequence $t^{(n)}_k \to \infty$ as $k \to \infty$ such that $\omega(t^{(n)}_k, \delta_n) > \epsilon$ for each $k$. Diagonalize to find $k_1 < k_2 < \cdots$ such that $s_n = t^{(n)}_{k_n} \to \infty$ and $\omega(s_n, \delta_n) > \epsilon$ for all $n$. Finally, for $n = 1, 2, \ldots$ choose $u_n \in [0, \delta_n]$ such that

$K(s_n u_n, (s_n, \infty]) > \omega(s_n, \delta_n) - \epsilon/2$,

and put $u(t) = \sum_n u_n 1_{[s_n, s_{n+1})}(t)$. Clearly $u(t) \to 0$, but $K(s_n u(s_n), (s_n, \infty]) > \epsilon/2$ for all $n$, contradicting (4.3).
(b) Write $M(v) = \limsup_{t\to\infty} \sup_{0 \le y \le t} t^{-1}\psi(y, v)$. Since

$\sup_{0 \le y \le \delta} t^{-1}\psi(ty, \cdot) = \delta \sup_{0 \le y \le t\delta} \frac{\psi(y, \cdot)}{t\delta}$

for $\delta > 0$, we have

$\limsup_{t\to\infty} \sup_{0 \le y \le \delta} t^{-1}\psi(ty, v) = \delta M(v)$.

Now, suppose $u_t \to 0$. Given any $\delta > 0$ we have

$t^{-1}\psi(tu_t, v) \le \sup_{0 \le y \le \delta} t^{-1}\psi(ty, v)$

provided $t$ is large enough, so $\limsup_t t^{-1}\psi(tu_t, v) \le \delta M(v)$. Consequently, $\limsup_t t^{-1}\psi(tu_t, v) = 0$ for every $v$ such that $M(v) < \infty$. Under (4.2), this means that $P[t^{-1}\psi(tu_t, V) \to 0] = 1$, implying (4.3).

(c) Suppose first that $\chi_w(a) = \sup_{0 \le y \le a} \phi(y, w) < \infty$ for all $a > 0$, whenever $w \in C$. Fixing $w \in C$ and $z \ge 0$, note that

$\sup_{0 \le y \le t} t^{-1}\psi(y, (z, w)) \le z + \sup_{0 \le y \le t} t^{-1}\phi(y, w)$,

and observe for any $a > 0$ that

$\sup_{0 \le y \le t} t^{-1}\phi(y, w) \le \Big(\sup_{0 \le y \le a} t^{-1}\phi(y, w)\Big) \vee \Big(\sup_{a \le y \le t} y^{-1}\phi(y, w)\Big) \le t^{-1}\chi_w(a) \vee \Big(\sup_{a \le y} y^{-1}\phi(y, w)\Big)$.

Choosing $a$ large enough that $\sup_{a \le y} y^{-1}\phi(y, w) < 1$, say, it follows that

$\limsup_{t\to\infty} \sup_{0 \le y \le t} t^{-1}\psi(y, (z, w)) \le z + 1$,

so $v = (z, w) \in B_0$. Therefore $P[(Z, W) \in B_0] \ge P[Z \ge 0,\ W \in C] = 1$.

Conversely, suppose there is a set $D$ with $P[W \in D] > 0$ such that $w \in D$ implies $\chi_w(a) = \infty$ for some $0 < a < \infty$. Since $\sup_{0 \le y \le t} t^{-1}\psi(y, (z, w)) \ge t^{-1}\chi_w(t)$, we have $[0,\infty) \times D \subset B_0^c$, contradicting (4.2). □



The exclusion of necessity from part (b) results from the fact that a kernel $K$ does not uniquely specify an update function $\psi$. Even when $K$ satisfies the regularity condition (4.3), it may be possible to choose a nasty update function $\psi$ which satisfies (4.4), but not (4.2). However, in such cases there may exist a different update function $\psi'$ corresponding to $K$ which does satisfy (4.2).

Here is an example of such a situation. We exhibit an update function $\psi$ for which (i) (4.4) holds; (ii) (4.2) fails because condition (c) in Proposition 4.1 fails; but yet (iii) the corresponding kernel satisfies the regularity condition (4.3). Furthermore, we present a different choice of update function corresponding to the same kernel which satisfies (4.2). Define $\psi(y, V = (Z, W)) = Zy + \phi(y, W)$, where

oo 
k=l 

and $W \sim U(0, 1)$. (i) Since $\phi(t, w) = 0$ for $t > 1/w$, it is clear that $\psi$ satisfies (4.4) with $\xi = Z$. (ii) Observe that for any $w \in (0, 1)$, $\phi(\cdot, w)$ is unbounded on the interval $[0, 1]$. Therefore, by part (c) of Proposition 4.1, (4.2) cannot hold for $\psi$. (iii) However, the corresponding kernel does satisfy the regularity condition (4.3). Suppose $u_t \to 0$ and $a > 0$ is arbitrarily large. Write

$P[t^{-1}\psi(tu_t, (Z, W)) > x] = P[Zu_t + t^{-1}\phi(tu_t, W) > x] \le P[t^{-1}\phi(tu_t, W) > x'] + P[Z > a]$,

choosing $0 < x' < x - au_t$. Since for any $t$, $\{w : \phi(tu_t, w) > tx'\} \subset \{(tu_t k)^{-1} : k = 1, 2, \ldots\}$, a set of measure 0 with respect to $P[W \in \cdot\,]$, (4.3) follows by letting $a \to \infty$. On the other hand, the update function $\psi'(y, Z) = Zy$ does satisfy (4.2), and for any $y$,

$P[\psi'(y, Z) \ne \psi(y, (Z, W))] = P[W \in \{(yk)^{-1} : k = 1, 2, \ldots\}] = 0$,

so $\psi'$ does indeed correspond to $K$.

The regularity condition (4.3) restricts attention to Markov chains for which the probability of returning to an extreme state in the next $m$ steps after falling below the extremal boundary is asymptotically negligible. For such chains, as well as those for which $y(t) \equiv 0$ is an extremal boundary, $X$ has the same asymptotic behaviour as its extremal component, as described next.

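Theorem 4.1. Suppose $X \sim K$ with $K \in D(G)$, and let $\rho$ be a metric on $\mathbb{R}^m$. If $y(t) \equiv 0$ is an extremal boundary for $K$, or if $K$ satisfies the regularity condition (4.3), then for any $\epsilon > 0$ we have

(4.5)  $P_{tu_t}[\rho(t^{-1}\mathbf{X}^{(t)}_m,\ t^{-1}\mathbf{X}_m) > \epsilon] \to 0$  $(t \to \infty,\ u_t \to u > 0)$.

Consequently,

(4.6)  $P_{tu_t}[(X_1/t, \ldots, X_m/t) \in \cdot\,] \Rightarrow P_u[(T_1^*, \ldots, T_m^*) \in \cdot\,]$  $(t \to \infty,\ u_t \to u > 0)$.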



First let us extend the regularity condition to higher-order transition kernels. 

Lemma 4.1. If $K$ satisfies (4.3), then so do the $m$-step transition kernels $K^m$.

Proof. This is established by induction. Let $u_t \to 0$ and $f \in C[0,\infty]$. For $m \ge 2$, we have

$K^m(tu_t, t\,\cdot)(f) = \int_{[0,\infty]} K^{m-1}(tu_t, t\,dv) \int_{[0,\infty]} K(tv, t\,dx)\, f(x)$.

Assume that $K^{m-1}(tu_t, t\,\cdot) \Rightarrow \epsilon_0$; (4.3) implies that $\int K(tv_t, t\,dx)\, f(x) \to f(0)$ whenever $v_t \to 0$. Therefore, by Lemma 8.2 (a), we conclude that

$K^m(tu_t, t\,\cdot)(f) \to f(0) = \epsilon_0(f)$. □

Proof of Theorem 4.1. Suppose $\epsilon > 0$ and $u_t \to u > 0$. Write

$P_{tu_t}[\rho(t^{-1}\mathbf{X}^{(t)}_m,\ t^{-1}\mathbf{X}_m) > \epsilon] = \sum_{k=1}^m P_{tu_t}[\rho(t^{-1}\mathbf{X}^{(t)}_m,\ t^{-1}\mathbf{X}_m) > \epsilon,\ \tau(t) = k]$.

Since $X_j = X^{(t)}_j$ while $j < \tau(t)$, for the $k$-th summand to converge to 0, it is sufficient that

$P_{tu_t}[\,|X^{(t)}_j/t - X_j/t| > \delta,\ \tau(t) = k] = P_{tu_t}[X_j/t > \delta,\ \tau(t) = k] \to 0$

for $j = k, \ldots, m$ and any $\delta > 0$. If $j = k$, we have

$P_{tu_t}[X_j/t > \delta,\ \tau(t) = k] \le P_{tu_t}[X_k/t > \delta,\ X_k/t \le y(t)] = 0$

for large $t$. For $j > k$, recalling the notation of Theorem 3.2,



Ptu, [Xj/t > 6 , T{t) = k]= lro,y(t)](xfc) Ptut [Xj/t > 5\Xk/t = Xk] Ptut [Xk/t G dxk 

= / Ptx^ [Xj-k > t6] l[0,y(t)] {xk) J^'k i^t ' d^k) 

J[0,oo]'' 
using the Markov property. We claim that this integral $\to 0$ as $t \to \infty$. If $y(t) \equiv 0$, this follows directly. Otherwise, recall that $\mu^{(t)}_k(u_t, \cdot) \xrightarrow{v} \mu_k(u, \cdot)$, and consider $h_t(\mathbf{x}_k) = P_{tx_k}[X_{j-k} > t\delta]\, 1_{[0, y(t)]}(x_k)$. Suppose $\mathbf{x}^{(t)}_k \to \mathbf{x}_k \in [0,\infty]^k$. If $x_k > 0$, then $h_t(\mathbf{x}^{(t)}_k) = 0$ for large $t$ because $y(t) \to 0$. Otherwise, if $x_k = 0$, we have $h_t(\mathbf{x}^{(t)}_k) \to 0$ since Lemma 4.1 implies that $P_{tx^{(t)}_k}[X_{j-k} > t\delta] \to 0$ as $t \to \infty$. Lemma 8.2 (b) establishes (4.5); (4.6) follows by Slutsky's theorem. □

Therefore, $X$ converges to $T^*$ in fdds under (a) $G(\{0\}) = 0$, (b) $G(\{0\}) > 0$ combined with (4.3), or (c) $G(\{0\}) > 0$ combined with the extremal boundary $y(t) \equiv 0$. In each of these cases, we will be able to replace the extremal component $X^{(t)}$ with the complete chain $X$ in the results of Sections 5.1 and 5.2. However, that $y(t) \equiv 0$ is an extremal boundary, and consequently that (4.6) holds, does not imply the regularity condition to hold, regardless of $G(\{0\})$; in particular, a kernel for which $G(\{0\}) = 0$ need not satisfy (4.3). This is illustrated in Example 6.3.

5. Convergence of the Unconditional FDDs 

5.1. Effect of a Regularly Varying Initial Distribution. So far our convergence results required that the initial state become large, and the only distributional assumption was that the transition kernel $K$ determining $X$ be attracted to some distribution $G$. To obtain a result for the unconditional distribution of $(X_0, \ldots, X_m)$, we require an additional assumption about how likely the initial observation $X_0$ is to be large. Using Lemma 8.4, the results of the previous sections extend to multivariate regular variation on the cone $E_m = (0,\infty] \times [0,\infty]^m$ when the distribution of $X_0$ has a regularly varying tail. This cone is smaller than the cone $[0,\infty]^{m+1} \setminus \{0\}$ traditionally employed in extreme value theory, because the kernel domain of attraction condition (2.5) is uninformative when the initial state is not extreme. This is analogous to the setting of the Conditional
Extreme Value Model considered in [H [13] . 

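Proposition 5.1. Assume $X \sim K$ with $K \in D(G)$, and $X_0 \sim H$, where $H$ is a distribution on $[0,\infty)$ with a regularly varying tail. This means that as $t \to \infty$, for some scaling function $b(t) \to \infty$,

$tH(b(t)\,\cdot) \xrightarrow{v} \nu_\alpha(\cdot)$ in $M_+(0,\infty]$,

where $\nu_\alpha(x, \infty] = x^{-\alpha}$ and $\alpha > 0$. Define the measure $\nu^*$ on $E_m = (0,\infty] \times [0,\infty]^m$ by

(5.1)  $\nu^*(dx_0, d\mathbf{x}_m) = \nu_\alpha(dx_0)\, P_{x_0}[(T_1^*, \ldots, T_m^*) \in d\mathbf{x}_m]$.

Then, for $m = 1, 2, \ldots$, the following convergences take place as $t \to \infty$.

(a) In $M_+((0,\infty]^m \times [0,\infty])$,

$tP[b(t)^{-1}(X_0, X_1, \ldots, X_m) \in \cdot \cap (0,\infty]^m \times [0,\infty]] \xrightarrow{v} \nu^*(\cdot \cap (0,\infty]^m \times [0,\infty])$.

(b) In $M_+(E_m)$,

$tP[b(t)^{-1}(X_0^{(b(t))}, X_1^{(b(t))}, \ldots, X_m^{(b(t))}) \in \cdot\,] \xrightarrow{v} \nu^*(\cdot)$.

(c) If either $G(\{0\}) = 0$, $y(t) \equiv 0$ is an extremal boundary, or $K$ satisfies the regularity condition (4.3), then in $M_+(E_m)$,

$tP[b(t)^{-1}(X_0, X_1, \ldots, X_m) \in \cdot\,] \xrightarrow{v} \nu^*(\cdot)$.

(d) In $M_+(0,\infty]$,

$tP[X_0/b(t) \in dx_0,\ \tau(b(t)) \ge m] \xrightarrow{v} (1 - G(\{0\}))^{m-1} \cdot \nu_\alpha(dx_0)$.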
Remark. These convergence statements may be reformulated equivalently as, say,

$P[b(t)^{-1}(X_0, X_1, \ldots, X_m) \in \cdot \mid X_0 > b(t)] \Rightarrow P[(T_0^*, T_1^*, \ldots, T_m^*) \in \cdot\,]$,

where $T_0^* \sim$ Pareto($\alpha$). This is the form considered by Segers [23].

Proof. Apply Lemma 8.4 to the results of Theorems 3.2, 3.3 and 4.1, and (3.15). □

In the case m = 1 , Ei is a rotated version of En used in the conditional extreme value model in 
[HI [9] and the limit can be expressed as 

$\nu^*((x_0,\infty] \times [0,x_1]) = \int_{x_0}^{\infty} \nu_\alpha(du)\, P[\xi \le x_1/u] = x_0^{-\alpha} P[\xi \le x_1/x_0] - x_1^{-\alpha}\, E\xi^\alpha 1_{\{\xi \le x_1/x_0\}}$

for $(x_0, x_1) \in (0,\infty] \times [0,\infty]$, where $\xi \sim G$ (with $E\xi^\alpha < \infty$). Since

$\nu^*((x_0,\infty] \times \{0\}) = x_0^{-\alpha} P[\xi = 0]$ and $\nu^*((0,\infty] \times (x_1,\infty]) = x_1^{-\alpha}\, E\xi^\alpha$,

sets on the $x_0$-axis incur mass proportional to $G(\{0\})$, and sets bounded away from this axis are weighted according to $E\xi^\alpha$. A consequence of the second observation is that

$\liminf_{t\to\infty} tP[X_1/b(t) > x] \ge E\xi^\alpha \cdot x^{-\alpha}$.

Thus, knowledge concerning the tail behaviour of $X_1$ imposes a restriction on the distributions $G$ to which $K$ can be attracted via the $\alpha$-th moment. For example, if $tP[X_1/b(t) \in \cdot\,] \xrightarrow{v} \nu_\alpha$, then we must have $E\xi^\alpha \le 1$; this property will be examined further in the next section and appears in various forms in Segers [23] and Basrak and Segers [1], in the stationary setting.
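The closed form for $\nu^*((x_0,\infty] \times [0,x_1])$ can be sanity-checked numerically. The sketch below assumes, purely for illustration, that $\xi$ has a discrete distribution given as a list of `(value, probability)` pairs and that $x_1 > 0$; the function names and the midpoint-rule discretisation are choices made here, not part of the paper. The integral against $\nu_\alpha$ is evaluated after the substitution $s = u^{-\alpha}$, under which $\nu_\alpha$ becomes Lebesgue measure on $(0, x_0^{-\alpha})$.

```python
def nu_star_closed_form(x0, x1, alpha, atoms):
    """x0^-a P[xi <= x1/x0] - x1^-a E[xi^a 1{xi <= x1/x0}] for discrete xi."""
    r = x1 / x0
    prob = sum(p for v, p in atoms if v <= r)
    moment = sum(p * v ** alpha for v, p in atoms if v <= r)
    return x0 ** -alpha * prob - x1 ** -alpha * moment

def nu_star_numeric(x0, x1, alpha, atoms, n=200_000):
    """Midpoint rule for the integral of P[xi <= x1/u] against nu_alpha on (x0, inf)."""
    s_max = x0 ** -alpha              # nu_alpha(x0, inf] = x0^-alpha
    h = s_max / n
    total = 0.0
    for i in range(n):
        u = ((i + 0.5) * h) ** (-1.0 / alpha)   # u runs over (x0, inf)
        total += sum(p for v, p in atoms if v <= x1 / u)
    return total * h
```

The atom at zero contributes its full probability times $x_0^{-\alpha}$, which matches the remark that sets on the $x_0$-axis carry mass proportional to $G(\{0\})$.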

5.2. Joint Tail Convergence. What additional assumptions are necessary for convergences (b) and (c) of the previous result to take place on the larger cone $E_m^* = [0,\infty]^{m+1} \setminus \{0\}$? This was considered by Segers [1, 23] for stationary Markov chains. In (b), the dependence on the extremal threshold and hence on $t$ means we are in the context of a triangular array and not, strictly speaking, in the setting of joint regular variation. However, the result is still useful, for example, to derive a
point process convergence via the Poisson transform [2l, p. 183]. 

As a first step, we characterize convergence on the larger cone by decomposing it into smaller, more familiar cones. This is similar to Theorem 6.1 in [23] and one of the implications of Theorem 2.1 in [1]. As a convention in what follows, set $[0,\infty]^0 \times A = A$. Also, recall the notation $E_m = (0,\infty] \times [0,\infty]^m$.

Proposition 5.2. Suppose $Y_t = (Y_{t,0}, Y_{t,1}, \ldots, Y_{t,m})$ is a random vector on $[0,\infty]^{m+1}$ for each $t > 0$. Then there exists a non-null Radon measure $\mu^*$ on $E_m^* = [0,\infty]^{m+1} \setminus \{0\}$ such that

(5.2)  $tP[(Y_{t,0}, Y_{t,1}, \ldots, Y_{t,m}) \in \cdot\,] \xrightarrow{v} \mu^*(\cdot)$ in $M_+(E_m^*)$  $(t \to \infty)$

if and only if for $j = 0, \ldots, m$ there exist Radon measures $\mu_j$ on $E_j = (0,\infty] \times [0,\infty]^j$, not all null, such that

(5.3)  $tP[(Y_{t,j}, \ldots, Y_{t,m}) \in \cdot\,] \xrightarrow{v} \mu_{m-j}(\cdot)$ in $M_+(E_{m-j})$.

The relation between the limit measures is the following:

$\mu_{m-j}(\cdot) = \mu^*([0,\infty]^j \times \cdot)$ on $E_{m-j}$

for $j = 0, \ldots, m$, and

$\mu^*([0, \mathbf{x}]^c) = \sum_{j=0}^{m} \mu_{m-j}((x_j,\infty] \times [0,x_{j+1}] \times \cdots \times [0,x_m])$ for $\mathbf{x} \in E_m^*$.

Furthermore, given $j \in \{0, \ldots, m-1\}$, if $A \subset [0,\infty]^{m-j} \setminus \{0\}^{m-j}$ is relatively compact, then $\mu_{m-j}((0,\infty] \times A) < \infty$.
Proof. Assume first that (5.2) holds. Fixing $j \in \{0, \ldots, m\}$, define $\mu_{m-j}(\cdot) := \mu^*([0,\infty]^j \times \cdot)$ (i.e. $\mu_m = \mu^*$). Let $A \subset E_{m-j}$ be relatively compact with $\mu_{m-j}(\partial A) = 0$. Then $A^* = [0,\infty]^j \times A$ is relatively compact in $E_m^*$, and $\partial_{E_m^*} A^* = [0,\infty]^j \times \partial_{E_{m-j}} A$, so $\mu^*(\partial_{E_m^*} A^*) = \mu_{m-j}(\partial A) = 0$. Therefore,

$tP[(Y_{t,j}, \ldots, Y_{t,m}) \in A] = tP[(Y_{t,0}, \ldots, Y_{t,m}) \in A^*] \to \mu^*(A^*) = \mu_{m-j}(A)$,

establishing (5.3).

Conversely, suppose we have (5.3) for $j = 0, \ldots, m$. For $\mathbf{x} \in (0,\infty]^{m+1}$, define

$h(\mathbf{x}) = \sum_{j=0}^{m} \mu_{m-j}((x_j,\infty] \times [0,x_{j+1}] \times \cdots \times [0,x_m])$.

Decompose $[0, \mathbf{x}]^c$ as a disjoint union

(5.4)  $[0, \mathbf{x}]^c = \bigcup_{j=0}^{m} [0,\infty]^j \times (x_j,\infty] \times [0,x_{j+1}] \times \cdots \times [0,x_m]$,

and observe that at points of continuity of the limit,

(5.5)  $tP[Y_t \in [0,\mathbf{x}]^c] = \sum_{j=0}^{m} tP[(Y_{t,j}, \ldots, Y_{t,m}) \in (x_j,\infty] \times [0,x_{j+1}] \times \cdots \times [0,x_m]] \to h(\mathbf{x})$.

Hence, (5.2) holds with the limit measure $\mu^*$ defined by $\mu^*([0,\mathbf{x}]^c) = h(\mathbf{x})$. Indeed, given $f \in C_K^+(E_m^*)$ we can find $\delta > 0$ such that $\mathbf{x}_\delta = (\delta, \ldots, \delta)$ is a continuity point of $h$ and $f$ is supported on $[0, \mathbf{x}_\delta]^c$. Therefore,

$tEf(Y_t) \le \sup_{\mathbf{x} \in E_m^*} f(\mathbf{x}) \cdot \sup_{t > 0} tP[Y_t \in [0, \mathbf{x}_\delta]^c] < \infty$,

implying that the set $\{tP[Y_t \in \cdot\,] ;\ t > 0\}$ is relatively compact in $M_+(E_m^*)$. Furthermore, if $t_k P[Y_{t_k} \in \cdot\,] \xrightarrow{v} \mu$ and $s_k P[Y_{s_k} \in \cdot\,] \xrightarrow{v} \mu'$ as $k \to \infty$, then $\mu = \mu' = \mu^*$ on sets $[0, \mathbf{x}]^c$ which are continuity sets of $\mu^*$ by (5.5). This extends to measurable rectangles in $E_m^*$ bounded away from 0 whose vertices are continuity points of $h$, leading us to the conclusion that $\mu = \mu' = \mu^*$ on $E_m^*$.

Moreover, since we can decompose $[0, \mathbf{x}]^c$ for any $\mathbf{x} \in E_m^*$ as in (5.4), it is clear that $\mu^*$ is non-null iff not all of the $\mu_j$ are null.

Finally, for $0 \le j \le m - 1$, if $A \subset [0,\infty]^{m-j} \setminus \{0\}^{m-j}$ is relatively compact, then it is contained in $[(0, \ldots, 0), (x_{j+1}, \ldots, x_m)]^c$ for some $(x_{j+1}, \ldots, x_m) \in (0,\infty]^{m-j}$. Applying (5.4) once again, we find that

$\mu_{m-j}((0,\infty] \times A) = \mu^*([0,\infty]^j \times (0,\infty] \times A)$

$\le \sum_{k=j+1}^{m} \mu^*([0,\infty]^{j+1} \times [0,\infty]^{k-j-1} \times (x_k,\infty] \times [0,x_{k+1}] \times \cdots \times [0,x_m])$

$= \sum_{k=j+1}^{m} \mu_{m-k}((x_k,\infty] \times [0,x_{k+1}] \times \cdots \times [0,x_m]) < \infty$. □
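The disjoint decomposition (5.4) used in the proof can be made concrete: a point outside the box $[0, \mathbf{x}]$ belongs to the piece indexed by the last coordinate at which it exceeds $\mathbf{x}$. The following is a small sketch with hypothetical helper names, using 0-based coordinate indices as in the proposition.

```python
def in_piece(y, x, j):
    """Membership of y in piece j of (5.4):
    [0,inf]^j x (x_j,inf] x [0,x_{j+1}] x ... x [0,x_m]."""
    return y[j] > x[j] and all(y[k] <= x[k] for k in range(j + 1, len(x)))

def piece_index(y, x):
    """Index of the unique piece containing y, or None if y lies in [0, x].

    The index is the last coordinate at which y exceeds x: every later
    coordinate is then within its bound, and earlier ones are unrestricted.
    """
    exceed = [j for j, (yj, xj) in enumerate(zip(y, x)) if yj > xj]
    return exceed[-1] if exceed else None
```

Checking every point of a small grid against `in_piece` confirms that the pieces are disjoint and cover exactly the complement of the box.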

Consequently, the extension of the convergences in Proposition 5.1 to the larger cone $E_m^*$ follows from regular variation of the marginal tails.

Theorem 5.1. Suppose $X \sim K \in D(G)$, and let $b(t) \to \infty$ be a scaling function and $\alpha > 0$. Then

(5.6)  $tP[b(t)^{-1}(X_0^{(b(t))}, \ldots, X_m^{(b(t))}) \in \cdot\,] \xrightarrow{v} \mu^*(\cdot)$ in $M_+(E_m^*)$  $(t \to \infty)$,
where



* 



{dxQ, dx) 



a 



{dxQ)?^,[{Tt, 



i^*{dxQ, dx) 



if and only if 




$tP[X_0 > b(t)x] \to \nu^*((x,\infty] \times [0,\infty]^m) = x^{-\alpha}$
for $x > 0$. Hence, $b(t) \in RV_{1/\alpha}$, so by (5.6) again, we have for $j \ge 1$



and 



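$c_j \ge \mu^*((0,\infty] \times [0,\infty]^{j-1} \times (1,\infty] \times [0,\infty]^{m-j}) = \int_{(0,\infty]} \nu_\alpha(du)\, P[\xi_1 \cdots \xi_j > u^{-1}] = E(\xi_1 \cdots \xi_j)^\alpha = (E\xi^\alpha)^j$.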



Conversely, suppose that (5.7) holds for $j = 0, \ldots, m$. Lemma 8.4 implies that in $M_+(E_{m-j})$,



At the end of Section 4, cases were outlined in which we could replace $X_i^{(b(t))}$ by $X_i$. Theorem 5.1 is most striking for these since it shows that for a Markov chain whose kernel is in a domain of attraction, to obtain joint regular variation of the fdds it is enough to know that the marginal tails are regularly varying. In particular, if $X$ has a regularly varying stationary distribution then the fdds are jointly regularly varying. This result was presented by Segers [23], and Basrak and Segers [1] showed that for a general stationary process, joint regular variation of fdds is equivalent to the existence of a "tail process" which reduces to the tail chain in the case of Markov chains. However, what Proposition 5.1 emphasizes is that it is the marginal tail behaviour alone, rather than stationarity, which provides the link with joint regular variation.

Theorem 5.1 also extends the observation made in Section 5.1 that knowledge of the marginal tail behaviour for a Markov chain whose kernel is in a domain of attraction constrains the class of possible limit distributions $G$ via its moments. If a particular choice of regularly varying initial distribution leads to $tP[X_j/b(t) \in \cdot\,] \xrightarrow{v} a_j \nu_\alpha(\cdot)$, then we have $E\xi^\alpha \le a_j^{1/j}$. In particular, if $X$ admits a stationary distribution whose tail is $RV_{-\alpha}$, then $E\xi^\alpha \le 1$.

6. Examples 
Our first example illustrates the main results. 

Example 6.1. Let $V = (Z, W)$ be any random vector on $[0,\infty) \times \mathbb{R}$. Consider the update function $\psi(y, V) = (Zy + W)_+$ and its canonical form

$\psi(y, V) = Zy + \phi(y, W) = Zy + (W 1_{\{W > -Zy\}} - Zy\, 1_{\{W \le -Zy\}})$.

For $y > 0$, the transition kernel has the form $K(y, (x,\infty)) = P[Zy + W > x]$. Since $t^{-1}\psi(t, V) = (Z + t^{-1}W)_+ \to Z$ a.s., we have $K \in D(G)$ with $G = P[Z \in \cdot\,]$. Furthermore, using Proposition 3.1, the function $\gamma(t) = \sqrt{t}$ is of larger order than $\phi(t, w)$, so $y(t) = 1/\sqrt{t}$ is an extremal boundary. Since $\phi(\cdot, w)$ is bounded on neighbourhoods of 0, Proposition 4.1 (c) implies $K$ satisfies the regularity condition (4.3). Consequently, from Theorem 4.1 we obtain fdd convergence of $t^{-1}X$ to $T^*$ as in (4.6). □
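The two facts used in Example 6.1 can be checked numerically for fixed values of $(Z, W)$: the normalized update converges to $Z$, and the remainder in the canonical form is bounded by $|W|$. This is a minimal sketch with hypothetical function names; the inputs `z` and `w` stand for realized values of $Z \ge 0$ and $W$.

```python
def psi_61(y, z, w):
    """Update function of Example 6.1: the positive part of z*y + w."""
    return max(z * y + w, 0.0)

def phi_61(y, z, w):
    """Remainder in the canonical form, psi(y, v) - z*y."""
    return psi_61(y, z, w) - z * y

def scaled_step(t, z, w):
    """t^{-1} psi(t, (z, w)), which approaches z as t grows."""
    return psi_61(t, z, w) / t
```

Since the remainder never exceeds $|w|$ in absolute value, it is bounded near 0 and of smaller order than $\sqrt{t}$, which is what the example relies on.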

If $K$ does not satisfy the regularity condition (4.3), Theorem 4.1 may fail to hold and starting from $tu$, $t^{-1}X$ may fail to converge to $T^*$ started from $u$.

Example 6.2. Let $V = (Z, W, W')$ be any non-degenerate random vector on $[0,\infty)^3$, and consider the Markov chain determined by the update function

$\psi(y, V) = Zy + Wy^{-1} 1_{\{y > 0\}} + W' 1_{\{y = 0\}}$.

For $y > 0$, the transition kernel is $K(y, (x,\infty)) = P[Zy + Wy^{-1} > x]$ and since $t^{-1}\psi(t, V) = Z + Wt^{-2} \to Z$ a.s., we have $K \in D(G)$ with $G = P[Z \in \cdot\,]$. Furthermore, using Proposition 3.1, the function $\gamma(t) = 1$ is of larger order than $\phi(t, w)$, so $y(t) = 1/t$ is an extremal boundary.

However, note that $\phi(y, (W, W')) = Wy^{-1} 1_{\{y > 0\}} + W' 1_{\{y = 0\}}$ is unbounded near 0, implying that Segers' boundedness condition (4.2) does not hold. In fact, our form of the regularity condition (4.3) fails for $K$. Indeed,

$K(tu_t, t(x,\infty)) = P[Ztu_t + W/(tu_t) > tx] = P[Zu_t + W/(t^2 u_t) > x]$.

Choosing $u_t = t^{-2}$ yields $K(tu_t, t(x,\infty)) \to P[W > x]$. For appropriate $x$, this shows (4.3) fails.

Not only does (4.3) fail but so does Theorem 4.1, since the asymptotic behaviour of $X$ is not the same as that of $X^{(t)}$. We show directly that the conditional fdds of $t^{-1}X$ fail to converge to those of $T^*$. The idea is that if $X_k < y(t) = t^{-1}$, there is a positive probability that $X_{k+1} > t$. We illustrate this for $m = 2$. Take $f \in C[0,\infty]^2$ and $u > 0$. Observe if $X_0 = tu > 0$, from the definition of $\psi$, $X_1 = Z_1 tu + W_1/(tu)$ and $X_2 = Z_2 X_1 + (W_2/X_1) 1_{\{X_1 > 0\}} + W_2' 1_{\{X_1 = 0\}}$. Furthermore, on $\{Z_1 > 0\}$, we have $X_1 > 0$ and $X_2 = Z_2 X_1 + W_2/X_1$. On $\{Z_1 = 0, W_1 > 0\}$, $X_1 > 0$ and $X_2 = Z_2 X_1 + W_2/X_1$. On $\{Z_1 = 0, W_1 = 0\}$, we have $X_1 = 0$ and $X_2 = W_2'$. Therefore

$E_{tu} f(X_1/t, X_2/t) = E_{tu} f(X_1/t, X_2/t) 1_{\{Z_1 > 0\}} + E_{tu} f(X_1/t, X_2/t) 1_{\{Z_1 = 0, W_1 > 0\}} + E_{tu} f(X_1/t, X_2/t) 1_{\{Z_1 = 0, W_1 = 0\}} = A + B + C$.

For $A$, as $t \to \infty$, we have

$A = Ef\big(Z_1 u + W_1/(t^2 u),\ Z_2[Z_1 u + W_1/(t^2 u)] + W_2/[Z_1 t^2 u + W_1 u^{-1}]\big) 1_{\{Z_1 > 0\}} \to Ef(Z_1 u, Z_1 Z_2 u) 1_{\{Z_1 > 0\}}$,

while for $B$ we obtain for $t \to \infty$,

$B = Ef\big(W_1/(t^2 u),\ Z_2 W_1/(t^2 u) + W_2 u/W_1\big) 1_{\{Z_1 = 0, W_1 > 0\}} \to Ef(0, uW_2/W_1) 1_{\{Z_1 = 0, W_1 > 0\}}$.

Finally for $C$,

$C = Ef(0, W_2'/t) 1_{\{Z_1 = 0, W_1 = 0\}} = P[Z_1 = 0, W_1 = 0]\, Ef(0, W_2'/t) \to P[Z_1 = 0, W_1 = 0]\, f(0, 0)$.

Observe that $\lim_{t\to\infty}[A + B + C] \ne E_u f(T_1^*, T_2^*) = Ef(u\xi_1, u\xi_1\xi_2)$. □
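The failure of the regularity condition in Example 6.2 can be seen by evaluating one normalized step from a small starting state. The sketch below uses hypothetical function names and fixed realized values `z`, `w`, `w_prime` of $(Z, W, W')$; it compares the choice $u_t = t^{-2}$, for which the normalized step approaches $W$ rather than 0, with a fixed $u > 0$, for which it approaches $Zu$.

```python
def psi_62(y, z, w, w_prime):
    """Update function of Example 6.2."""
    return z * y + w / y if y > 0 else w_prime

def scaled_step(t, u_t, z, w, w_prime):
    """t^{-1} psi(t * u_t, V): one normalized step started from t * u_t."""
    return psi_62(t * u_t, z, w, w_prime) / t
```

With `u_t = t ** -2` the starting state is $1/t$, so the term $W/y$ is of order $t$ and survives the normalization, which is the mechanism behind the jump back to an extreme state.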

In the final example, the conditional distributions of $t^{-1}X$ converge to those of the tail chain $T^*$, even though the regularity condition does not hold. This includes cases for which $G(\{0\}) = 0$ and $G(\{0\}) > 0$ with extremal boundary $y(t) \equiv 0$.

Example 6.3. Let $\{(\xi_j, \eta_j),\ j \ge 1\}$ be iid copies of the non-degenerate random vector $(\xi, \eta)$ on $[0,\infty)^2$. Taking $V = (\xi, \eta)$, consider a Markov chain which transitions according to the update function

$\psi(y, V) = \xi(y + y^{-1}) 1_{\{y > 0\}} + \eta 1_{\{y = 0\}} = \xi y + (\xi y^{-1} 1_{\{y > 0\}} + \eta 1_{\{y = 0\}})$,

where the last expression is the canonical form. For $y > 0$, the transition kernel is

$K(y, [0, x]) = P[\xi(y + y^{-1}) \le x] = P[\xi \le x/(y + y^{-1})]$.

For $t > 0$, $t^{-1}\psi(t, V) = \xi(1 + t^{-2}) \to \xi$ a.s., so $K \in D(G)$ with $G = P[\xi \in \cdot\,]$. Note that $\phi(y, V) = \xi y^{-1} 1_{\{y > 0\}} + \eta 1_{\{y = 0\}}$ is unbounded near 0, implying that Segers' boundedness condition (4.2) does not hold. Also, our regularity condition (4.3) fails for $K$. To see this, write

$K(tu_t, t(x,\infty)) = P[\xi > x/(u_t + (t^2 u_t)^{-1})]$.

Fix $x$ so that $P[\xi > x] > 0$ and choose $u_t = t^{-2}$. This yields $u_t + (t^2 u_t)^{-1} = 1 + t^{-2}$, implying that

$K(tu_t, t(x,\infty)) = P[\xi > x/(1 + t^{-2})] \ge P[\xi > x] > 0$,

so (4.3) fails for $K$. However, since $K(t, \{0\}) = P[\xi = 0] = G(\{0\})$, the choice $y(t) \equiv 0$ satisfies the definition of an extremal boundary (3.6), even if $G(\{0\}) > 0$. This leads to fdd convergence of $P_{tu}[t^{-1}X \in \cdot\,]$ to $P_u[T^* \in \cdot\,]$, and thus we learn that the conclusion (4.6) of Theorem 4.1 may hold without (4.3) being true.

We prove the fdd convergence for $m = 2$. For $u > 0$, and $X_0 = tu$, we have $X_1 = \xi_1(tu + (tu)^{-1})$ and $X_2 = \xi_2(X_1 + X_1^{-1}) 1_{\{X_1 > 0\}} + \eta_2 1_{\{X_1 = 0\}}$. On $\{X_1 > 0\} = \{\xi_1 > 0\}$ we have $X_2 = \xi_2(X_1 + X_1^{-1})$. On $\{X_1 = 0\} = \{\xi_1 = 0\}$, we have $X_2 = \eta_2$. Thus, as $t \to \infty$,

$E_{tu} f(X_1/t, X_2/t) 1_{\{X_1 > 0\}} = Ef\big(\xi_1[u + (t^2 u)^{-1}],\ \xi_2\xi_1[u + (t^2 u)^{-1}] + \xi_2/(t\,\xi_1[tu + 1/(tu)])\big) 1_{\{\xi_1 > 0\}} \to Ef(u\xi_1, u\xi_1\xi_2) 1_{\{\xi_1 > 0\}}$,

while

$E_{tu} f(X_1/t, X_2/t) 1_{\{X_1 = 0\}} = Ef(0, \eta_2/t) 1_{\{\xi_1 = 0\}} \to P[\xi_1 = 0]\, f(0, 0)$.

We conclude that

$E_{tu} f(X_1/t, X_2/t) \to Ef(u\xi_1, u\xi_1\xi_2) = E_u f(T_1^*, T_2^*)$. □
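The two-step computation in Example 6.3 can be replayed for fixed realized multipliers. The following sketch uses hypothetical names and takes the values of $(\xi_1, \eta_1), (\xi_2, \eta_2)$ as inputs; it returns the normalized pair $(X_1/t, X_2/t)$, which can be compared with the tail chain values $(u\xi_1, u\xi_1\xi_2)$.

```python
def psi_63(y, xi, eta):
    """Update function of Example 6.3."""
    return xi * (y + 1.0 / y) if y > 0 else eta

def two_steps_scaled(t, u, step1, step2):
    """(X_1/t, X_2/t) for the chain started at X_0 = t*u.

    step1 and step2 are (xi, eta) pairs driving the first and second step.
    """
    x1 = psi_63(t * u, *step1)
    x2 = psi_63(x1, *step2)
    return x1 / t, x2 / t
```

When the first multiplier is 0 the chain lands exactly at 0, and the next state $\eta_2$ vanishes after normalization, so the path still agrees with the absorbed tail chain in the limit.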
7. Concluding Remarks 

We have thus placed the traditional tail chain model for the extremes of a Markov chain in 
a more general context through the introduction of the boundary distribution H as well as the 
extremal boundary. A common application of the tail chain model is in deriving the weak limits 
of exceedance point processes for X [H [I8l [22]. We will shortly use our results to develop a 
detailed description of the clustering properties of extremes of Markov chains by means of such point 
processes. Furthermore, as we have not employed stationarity in our finite-dimensional results, we 
propose to substitute the inherent regenerative structure of a Harris recurrent Markov chain for 
the traditional assumption of stationarity. Also, it would be interesting to explore the implications 
of choices of $H$ other than $\epsilon_0$.

8. Appendix: Technical Lemmas 

This section collects lemmas needed to prove convergence of integrals of the form $\int f_n\, d\mu_n$, assuming that $f_n \to f$ and $\mu_n \to \mu$ in their respective spaces. An example is the second continuous
mapping theorem [21 Theorem 5.5, p. 34]. 

Lemma 8.1. Assume $E$ and $E'$ are complete separable (cs) metric spaces, and for $n \ge 0$, $h_n : E \to E'$ are measurable. Put $A = \{x \in E : h_n(x_n) \to h_0(x) \text{ whenever } x_n \to x\}$. If $P_n$, $n \ge 0$ are probability measures on $E$ with $P_n \Rightarrow P_0$, and $h_n \to h_0$ almost uniformly in the sense that $P_0(A) = 1$, then $P_n \circ h_n^{-1} \Rightarrow P_0 \circ h_0^{-1}$ in $E'$.

The result provides a way to handle the convergence of a family of integrals. 

Lemma 8.2. In addition to the assumptions of Lemma 8.1, require $E' = \mathbb{R}$ and $\{h_n,\ n \ge 0\}$ is uniformly bounded, so that $\sup_{n \ge 0} \sup_{x \in E} |h_n(x)| < \infty$.

(a) We have

$\int_E h_n\, dP_n \to \int_E h_0\, dP_0$.

(b) Suppose additionally that $E$ is locally compact with a countable base (lccb), and $\mu_n \xrightarrow{v} \mu_0$ in $M_+(E)$ with $\mu_0(A^c) = 0$. If there exists a compact set $B \in \mathcal{K}(E)$ with $\mu_0(\partial B) = 0$ such that $h_n(x) = 0$, $n \ge 0$ whenever $x \notin B$ (i.e. $B$ is a common compact support of each $h_n$), then

$\int_E h_n\, d\mu_n \to \int_E h_0\, d\mu_0$.

Proof. (a) If $X_n \sim P_n$ for $n \ge 0$, then $h_n(X_n) \Rightarrow h_0(X_0)$. The uniform boundedness of the $h_n$ guarantees that $Eh_n(X_n) \to Eh_0(X_0)$.

(b) View $B$ as a compact subspace of $E$ inheriting the relative topology. Then, assuming $\mu_0(B) > 0$ to rule out a trivial case, define probabilities on $B$ by $P_n(\cdot) = \mu_n(\cdot \cap B)/\mu_n(B)$, $n \ge 0$. Since $\mu_n(\cdot \cap B) \xrightarrow{v} \mu_0(\cdot \cap B)$ by Proposition 3.3 in [12], and $B$ is compact, we get $P_n \Rightarrow P_0$. Denote by $h_n'$, $n \ge 0$, the restriction of $h_n$ to $B$. Observe that for any $x \in A \cap B$, we have $h_n'(x_n) \to h_0'(x)$ whenever $x_n \to x$ in $B$, and $P_0(A^c \cap B) \le \mu_0(A^c)/\mu_0(B) = 0$. Therefore, apply part (a) to obtain

$\int_E h_n\, d\mu_n = \int_E h_n 1_B\, d\mu_n = \mu_n(B) \int_B h_n'\, dP_n \to \mu_0(B) \int_B h_0'\, dP_0 = \int_E h_0\, d\mu_0$. □

A convenient specialization of Lemma 8.2 (b) is the following.

Lemma 8.3. Suppose $E$ is lccb and $\mu_n \xrightarrow{v} \mu$ in $M_+(E)$. If $f$ is continuous and bounded, and $B \subset E$ is relatively compact with $\mu(\partial B) = 0$, then

$\int_B f\, d\mu_n \to \int_B f\, d\mu$.

Take $h_n = f 1_B$ for $n \ge 0$. Since $f 1_B$ is continuous except possibly on $\partial B$, we have $\mu(A^c) \le \mu(\partial B) = 0$.

The next result is used to extend convergence of substochastic transition functions to multivariate 
regular variation on a larger space. 

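Lemma 8.4. Let $E \subset [0,\infty]^m$ and $E' \subset [0,\infty]^{m'}$ be two nice (lccb) spaces. Suppose for $t \ge 0$ that $\{p^{(t)}(\cdot, \cdot)\}_{t \ge 0}$ are substochastic transition functions on $E \times \mathcal{B}(E')$. This means $p^{(t)}(\cdot, B)$ is a measurable function for any fixed $B \in \mathcal{B}(E')$, $p^{(t)}(x, \cdot)$ is a measure for any $x \in E$, and $\sup_{t \ge 0} \sup_{u \in E} p^{(t)}(u, E') \le 1$. Assume there is a set $A \subset E$ such that

$p^{(t)}(u_t, \cdot) \xrightarrow{v} p^{(0)}(u, \cdot)$ in $M_+(E')$  $(t \to \infty)$

whenever $u_t \to u$ in $E$ and $u \in A$. Suppose also that $\{\nu^{(t)}\}_{t \ge 0}$ are measures on $E$ such that $\nu^{(0)}(A^c) = 0$, and $\nu^{(t)} \xrightarrow{v} \nu^{(0)}$ in $M_+(E)$. Then, defining measures $\mu^{(t)}$ for $t \ge 0$ on $E \times E'$ as

$\mu^{(t)}(du, dx) = \nu^{(t)}(du)\, p^{(t)}(u, dx)$,

we have

$\mu^{(t)} \xrightarrow{v} \mu^{(0)}$ in $M_+(E \times E')$  $(t \to \infty)$.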

Proof. Let $f \in C_K^+(E \times E')$; without loss of generality assume $f$ is supported on $K \times K'$, where $K \in \mathcal{K}(E)$ and $K' \in \mathcal{K}(E')$. We have

$\int_{E \times E'} \mu^{(t)}(du, dx)\, f(u, x) = \int_E \nu^{(t)}(du) \int_{E'} p^{(t)}(u, dx)\, f(u, x)$.

For $t \ge 0$, write

$\varphi_t(u) = \int_{E'} p^{(t)}(u, dx)\, f(u, x)$

and suppose $u_t \to u_0$ with $u_0 \in A$; we verify that $\varphi_t(u_t) \to \varphi_0(u_0)$. Writing $g_t(x) = f(u_t, x)$, $t \ge 0$, we have $g_t(x_t) \to g_0(x_0)$ whenever $x_t \to x_0 \in E'$ by the continuity of $f$. Also, the $g_t$ are uniformly bounded by the bound on $f$ and $g_t(x) = 0$ for all $t$ whenever $x \notin K'$. Furthermore, without loss of generality we can assume that $p^{(0)}(u_0, \partial K') = 0$. Now apply Lemma 8.2 (b) to obtain

$\varphi_t(u_t) = \int_{E'} p^{(t)}(u_t, dx)\, g_t(x) \to \int_{E'} p^{(0)}(u_0, dx)\, g_0(x) = \varphi_0(u_0)$.

Since the $p^{(t)}$ are substochastic, and $\varphi_t(u) = 0$ for all $t$ whenever $u \notin K$, the $\varphi_t$ are uniformly bounded by the bound on $f$. Assume similarly that $\nu^{(0)}(\partial K) = 0$, and recall that $\nu^{(0)}(A^c) = 0$. Apply Lemma 8.2 (b) once more to conclude as $t \to \infty$ that

$\int_{E \times E'} \mu^{(t)}(du, dx)\, f(u, x) = \int_E \nu^{(t)}(du)\, \varphi_t(u) \to \int_E \nu^{(0)}(du)\, \varphi_0(u) = \int_{E \times E'} \mu^{(0)}(du, dx)\, f(u, x)$. □

We conclude this section with a result used to verify the existence of the extremal boundary. 

Lemma 8.5. Suppose $P_t$, $t \ge 0$ are probability measures on a cs metric space $E$ such that $P_t \Rightarrow P_0$, and let $A$ be measurable. Then there exists a sequence of sets $A_t \downarrow A$ such that $P_t(A_t) \to P_0(A)$.

Remark. Note that if $P_0(\partial A) = 0$ then we can take $A_t = A$. In the case of distribution functions $F_t \Rightarrow F$ on $\mathbb{R}^m$, taking $A = (-\infty, x]$ and metric $\rho = \rho_\infty$ shows that for any $x \in \mathbb{R}^m$ there exists $x_t \downarrow x$ such that $F_t(x_t) \to F(x)$.
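A one-dimensional toy case shows why the evaluation points in the Remark may need to move. The sketch below is an illustration chosen here, not taken from the paper: $F_t$ is the distribution function of a point mass at $1/t$, which converges weakly to the point mass at 0. At the discontinuity point $x = 0$ the values $F_t(0)$ stay at 0, while evaluating along $x_t = 1/t \downarrow 0$ recovers $F(0) = 1$.

```python
def cdf_point_mass(location, x):
    """Distribution function of the unit point mass at `location`, evaluated at x."""
    return 1.0 if x >= location else 0.0

def F_t(t, x):
    """F_t: distribution function of the point mass at 1/t (toy example)."""
    return cdf_point_mass(1.0 / t, x)

def F_limit(x):
    """F: distribution function of the point mass at 0, the weak limit of F_t."""
    return cdf_point_mass(0.0, x)
```

Here the set $A = (-\infty, 0]$ has $P_0(\partial A) = 1$, so the shortcut $A_t = A$ from the Remark is not available and shrinking sets are genuinely needed.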

Proof. Let $\rho$ be a metric on $E$, and consider sets $A_\delta = \{x : \rho(x, A) < \delta\}$. Recall that $P_0(\partial A_\delta) = 0$ for all but a countable number of choices of $\delta$, since $F(\delta) = P_0(A_\delta) - P_0(A)$ is a distribution function. First choose $\{\delta_k : k = 1, 2, \ldots\}$ such that $0 < \delta_{k+1} < \delta_k \wedge 1/(k+1)$ and $P_0(\partial A_{\delta_k}) = 0$ for all $k$. Next, let $s_0 = 0$ and take $s_k \ge s_{k-1} + 1$, $k = 1, 2, \ldots$ such that $P_t(A_{\delta_k}) > P_0(A) - 1/k$ whenever $t \ge s_k$; this is possible since $P_t(A_{\delta_k}) \to P_0(A_{\delta_k}) \ge P_0(A)$ for all $k$. Finally, for $t > 0$ set

$A(t) = A_{\delta_1} 1_{(0, s_1)}(t) + \sum_{k=1}^{\infty} A_{\delta_k} 1_{[s_k, s_{k+1})}(t)$.

We claim that $A(t) \downarrow A$ and that $P_t(A(t)) \to P_0(A)$ as $t \to \infty$. It is clear that $A(t) \supset A(t')$ for $t < t'$, and $\bigcap_t A(t) = \bigcap_k A_{\delta_k} = A$. On the one hand, for large $t$ we have $A(t) \subset A_{\delta_k}$ for any $k$, so

$\limsup_{t\to\infty} P_t(A(t)) \le \limsup_{t\to\infty} P_t(A_{\delta_k}) \le P_0(A_{\delta_k})$.

Letting $k \to \infty$ shows that $\limsup_t P_t(A(t)) \le P_0(A)$. On the other hand, if $k(t)$ denotes the value of $k$ for which $s_k \le t < s_{k+1}$, then

$P_t(A(t)) = P_t(A_{\delta_{k(t)}}) > P_0(A) - 1/k(t)$,

so $\liminf_t P_t(A(t)) \ge P_0(A)$. Combining these two inequalities shows that $P_t(A(t)) \to P_0(A)$. □